(54) VOICE TRANSMISSION DEVICE

(57)Abstract: 

PROBLEM TO BE SOLVED: To provide a voice transmission device that
absorbs the difference of clock signals between a transmitter side device
and a receiver side device with high precision and at a low cost while
maintaining high voice quality.

SOLUTION: A buffer control section 15 inserts a silence voice signal
generated by a silence coding voice signal generating section 13 into the
voice signal stored in a buffer section 14, or discards a silence voice
signal from the voice signals stored in the buffer section 14, on the basis
of the storage amount of voice signals monitored by a buffer capacity
monitor section 16 and the voiced/silence information detected by a voice
detection section 11. Thus, the change in the amount of voice signals
stored in the buffer section 14 is reduced so as to eliminate defects in
voice transmission due to a difference of clock signals between the
transmitter side device and the receiver side device.
CLAIMS



[Claim(s)] 

[Claim 1] Voice transmission equipment characterized by comprising: an IP packet receive section which
extracts a sound signal from a received IP packet; a voice detection section which detects voiced/silent
information indicating the voiced and silent sections of the sound signal extracted by the above-mentioned IP
packet receive section; a buffer section which accumulates the sound signal extracted by the above-mentioned
IP packet receive section; a buffer monitoring section which monitors the accumulated amount of the sound
signal accumulated in the above-mentioned buffer section; a buffer control section which, based on the
accumulated amount of the sound signal monitored by the above-mentioned buffer monitoring section and the
voiced/silent information detected by the above-mentioned voice detection section, inserts a new sound signal
into the sound signal accumulated in the above-mentioned buffer section or discards part of the sound signal
accumulated in the above-mentioned buffer section; and a decode section which decodes the 2nd sound signal
resulting from the insertion or discarding performed by the above-mentioned buffer control section.

[Claim 2] Voice transmission equipment according to claim 1, characterized by further comprising a silent sound
signal generation section which generates a silent sound signal to be inserted into the sound signal accumulated
in the above-mentioned buffer section, and a marker assigning section which assigns a marker indicating the
above-mentioned voiced/silent information to the sound signal extracted by the above-mentioned IP packet
receive section, wherein: the above-mentioned buffer section accumulates the sound signal to which the marker
has been assigned by the above-mentioned marker assigning section; the above-mentioned buffer control
section, based on the accumulated amount of the sound signal monitored by the above-mentioned buffer
monitoring section and the marker assigned to the above-mentioned sound signal, inserts the silent sound signal
generated by the above-mentioned silent sound signal generation section into a silent section of the sound
signal accumulated in the above-mentioned buffer section, or discards the silent sound signal of a silent section
of the sound signal accumulated in the above-mentioned buffer section; and the above-mentioned decode
section decodes the 2nd sound signal obtained by removing the marker from the sound signal into which the
silent sound signal has been inserted, or from which it has been discarded, by the above-mentioned buffer
control section.
[Claim 3] Voice transmission equipment according to claim 2, characterized in that the above-mentioned buffer
control section inserts the silent sound signal generated by the above-mentioned silent sound signal generation
section into a silent section of the sound signal accumulated in the above-mentioned buffer section when the
accumulated amount of the sound signal accumulated in the above-mentioned buffer section is at or below a
preset lower limit, and discards the silent sound signal of a silent section of the sound signal accumulated in the
above-mentioned buffer section when the accumulated amount of the above-mentioned sound signal is at or
above a preset upper limit.

[Claim 4] Voice transmission equipment according to claim 2, characterized by further comprising a silence
duration measuring section which measures the duration of a silent section of the sound signal accumulated in
the above-mentioned buffer section, wherein the above-mentioned buffer control section, when the accumulated
amount of the sound signal accumulated in the above-mentioned buffer section is at or below a preset lower
limit, inserts the silent sound signal generated by the above-mentioned silent sound signal generation section
into a silent section of the sound signal accumulated in the above-mentioned buffer section based on the
above-mentioned duration and the marker assigned to the above-mentioned sound signal, and, when the
accumulated amount is at or above a preset upper limit, discards the silent sound signal of a silent section of the
sound signal accumulated in the above-mentioned buffer section based on the above-mentioned duration and
the marker assigned to the above-mentioned sound signal.
[Claim 5] Voice transmission equipment according to claim 3, characterized by comprising a marker assigning
section which assigns to the above-mentioned sound signal a marker indicating a front hangover section,
consisting of the silent section that extends from a fixed time before a change from silent to voiced up to that
change, and a hangover section, consisting of the silent section that extends from a change from voiced to
silent until a fixed time afterward, wherein the above-mentioned buffer control section, when the accumulated
amount of the sound signal accumulated in the above-mentioned buffer section is at or below a preset lower
limit, inserts the silent sound signal generated by the above-mentioned silent sound signal generation section
into a silent section of the sound signal that is neither the above-mentioned front hangover section nor the
above-mentioned hangover section, and, when the accumulated amount is at or above any of two or more
preset upper limits, discards, according to these two or more upper limits, the silent sound signal of the
above-mentioned front hangover section, of the above-mentioned hangover section, or of a silent section that
is neither the above-mentioned front hangover section nor the above-mentioned hangover section.

[Claim 6] Voice transmission equipment according to claim 3, characterized by further comprising a received-data
discrimination section which discriminates whether the sound signal extracted by the above-mentioned IP packet
receive section is a sound signal of conversation, and a selector which, based on the discrimination result of the
above-mentioned received-data discrimination section, selects either a signal other than a sound signal from the
above-mentioned IP packet receive section or the sound signal accumulated in the above-mentioned buffer
section, wherein the above-mentioned decode section decodes, according to the selection result of the
above-mentioned selector, the 3rd sound signal obtained by removing the marker from the sound signal into
which the above-mentioned silent sound signal has been inserted or from which it has been discarded.
[Claim 7] Voice transmission equipment according to claim 6, characterized in that whether the sound signal of the
above-mentioned IP packet receive section is a facsimile signal is judged based on the discrimination result of
the above-mentioned received-data discrimination section, the equipment further comprising a facsimile protocol
analysis section which analyzes the protocol of this facsimile signal when the above-mentioned sound signal is a
facsimile signal, wherein, when a facsimile signal is accumulated in the above-mentioned buffer section, the
above-mentioned buffer control section, based on the analysis information from the above-mentioned facsimile
protocol analysis section, inserts the above-mentioned silent sound signal into, or discards the silent sound
signal from, a silent section of the sound signal in which insertion or discarding of the silent sound signal causes
no problem for the protocol of the facsimile signal accumulated in the above-mentioned buffer section.

[Claim 8] Voice transmission equipment characterized by comprising: an IP packet receive section which
extracts a sound signal from a received IP packet; a decode section which decodes the sound signal extracted
by the above-mentioned IP packet receive section; a voice detection section which detects, from the 3rd sound
signal decoded by the above-mentioned decode section, voiced/silent information indicating its voiced and silent
sections; a buffer section which accumulates the 3rd sound signal decoded by the above-mentioned decode
section; a buffer monitoring section which monitors the accumulated amount of the 3rd sound signal accumulated
in the above-mentioned buffer section; and a buffer control section which, based on the accumulated amount of
the 3rd sound signal monitored by the above-mentioned buffer monitoring section and the voiced/silent
information detected by the above-mentioned voice detection section, inserts a new sound signal into the 3rd
sound signal accumulated in the above-mentioned buffer section or discards part of the 3rd sound signal
accumulated in the above-mentioned buffer section.
[Claim 9] Voice transmission equipment according to claim 8, characterized by further comprising a silent sound
signal generation section which generates a silent sound signal to be inserted into the 3rd sound signal
accumulated in the above-mentioned buffer section, and a marker assigning section which assigns a marker
indicating the voiced/silent information to the 3rd sound signal decoded by the above-mentioned decode section,
wherein the above-mentioned buffer section accumulates the 3rd sound signal to which the marker has been
assigned by the above-mentioned marker assigning section, and the above-mentioned buffer control section,
based on the accumulated amount of the 3rd sound signal monitored by the above-mentioned buffer monitoring
section and the marker assigned to the above-mentioned 3rd sound signal, inserts the silent sound signal
generated by the above-mentioned silent sound signal generation section into a silent section of the 3rd sound
signal accumulated in the above-mentioned buffer section, or discards the silent sound signal of a silent section
of the 3rd sound signal accumulated in the above-mentioned buffer section.

[Claim 10] Voice transmission equipment according to claim 9, characterized in that the above-mentioned buffer
control section inserts the silent sound signal generated by the above-mentioned silent sound signal generation
section into a silent section of the 3rd sound signal accumulated in the above-mentioned buffer section when the
accumulated amount of the 3rd sound signal accumulated in the above-mentioned buffer section is at or below a
preset lower limit, and discards the silent sound signal of a silent section of the 3rd sound signal accumulated in
the above-mentioned buffer section when the accumulated amount of the above-mentioned 3rd sound signal is at
or above a preset upper limit.

[Claim 11] Voice transmission equipment according to claim 9, characterized in that the above-mentioned buffer
control section, when the accumulated amount of the 3rd sound signal accumulated in the above-mentioned
buffer section is at or below any of two or more preset lower limits, inserts, according to these two or more
lower limits, the silent sound signal generated by the above-mentioned silent sound signal generation section
into a silent section of the above-mentioned 3rd sound signal, and, when the accumulated amount of the 3rd
sound signal accumulated in the above-mentioned buffer section is at or above any of two or more preset upper
limits, discards, according to these two or more upper limits, the silent sound signal of a silent section of the
above-mentioned 3rd sound signal.

[Claim 12] Voice transmission equipment according to claim 11, characterized in that the above-mentioned buffer
control section performs interpolation processing, which inserts a signal for the missing part of a voiced section
of the above-mentioned 3rd sound signal, when the accumulated amount is at or below the smallest of the two or
more above-mentioned lower limits and there is no silent section in the 3rd sound signal accumulated in the
above-mentioned buffer section.

[Claim 13] Voice transmission equipment according to claim 9, characterized by further comprising a silence
duration measuring section which measures the duration of a silent section of the 3rd sound signal accumulated
in the above-mentioned buffer section, wherein the above-mentioned buffer control section, when the
accumulated amount of the 3rd sound signal accumulated in the above-mentioned buffer section is at or below a
preset lower limit, inserts the silent sound signal generated by the above-mentioned silent sound signal
generation section into a silent section of the 3rd sound signal accumulated in the above-mentioned buffer
section based on the above-mentioned duration and the marker assigned to the above-mentioned 3rd sound
signal, and, when the accumulated amount is at or above a preset upper limit, discards the silent sound signal of
a silent section of the 3rd sound signal accumulated in the above-mentioned buffer section based on the
above-mentioned duration and the marker assigned to the above-mentioned 3rd sound signal.

[Claim 14] Voice transmission equipment according to claim 10, characterized by comprising a marker assigning
section which assigns to the above-mentioned 3rd sound signal a marker indicating a front hangover section,
consisting of the silent section that extends from a fixed time before a change from silent to voiced up to that
change, and a hangover section, consisting of the silent section that extends from a change from voiced to
silent until a fixed time afterward, wherein the above-mentioned buffer control section, when the accumulated
amount of the 3rd sound signal accumulated in the above-mentioned buffer section is at or below a preset lower
limit, inserts the above-mentioned silent sound signal into a silent section of the 3rd sound signal that is neither
the above-mentioned front hangover section nor the above-mentioned hangover section, and, when the
accumulated amount is at or above any of two or more preset upper limits, discards, according to these two or
more upper limits, the silent sound signal of the above-mentioned front hangover section, of the
above-mentioned hangover section, or of a silent section that is neither the above-mentioned front hangover
section nor the above-mentioned hangover section.

[Claim 15] Voice transmission equipment according to claim 10, characterized by comprising a received-data
discrimination section which discriminates whether the 3rd sound signal decoded by the above-mentioned decode
section is a sound signal of conversation, and a selector which, based on the discrimination result of the
above-mentioned received-data discrimination section, selects either a signal other than the 3rd sound signal
from the above-mentioned IP packet receive section or the 3rd sound signal accumulated in the above-mentioned
buffer section.

[Claim 16] Voice transmission equipment according to claim 15, characterized in that whether the 3rd sound signal
decoded by the above-mentioned decode section is a facsimile signal is judged based on the discrimination result
of the above-mentioned received-data discrimination section, the equipment further comprising a facsimile
protocol analysis section which analyzes the protocol of this facsimile signal when the above-mentioned 3rd
sound signal is a facsimile signal, wherein, when a facsimile signal is accumulated in the above-mentioned buffer
section, the above-mentioned buffer control section, based on the analysis information from the above-mentioned
facsimile protocol analysis section, inserts the above-mentioned silent sound signal into, or discards the silent
sound signal from, a silent section of the 3rd sound signal in which insertion or discarding of the silent sound
signal causes no problem for the protocol of the facsimile signal accumulated in the above-mentioned buffer
section.
DETAILED DESCRIPTION



[Detailed Description of the Invention] 
[0001] 

[Field of the Invention] This invention relates to voice transmission equipment which transmits signals, such
as a sound signal, using IP packets.

[0002] 

[Description of the Prior Art] Conventionally, when voice transmission equipment provides a service that
requires real-time transmission through an IP network, such as an Internet-based phone service, synchronization
of the transmission clocks between pieces of equipment is not taken into consideration. As a result, the clock
error between the pieces of equipment causes an excess or deficiency of received data in the receiving-side
equipment. In order to solve this problem, a clock with very high precision is used as the clock mounted in the
equipment.

[0003] Moreover, drawing 19 is a block diagram of the conventional voice transmission equipment shown in
JP,5-103012,A. In the conventional voice transmission equipment of drawing 19, the output signal 126 of the
packetized voice inverse transformation section 101, to which the packetized voice input line 111 is connected, is
input to the compensation pattern generation section 104, the voiced/silent judging section 102, and a selection
circuit 105. The output signal 127 of the above-mentioned compensation pattern generation section 104 and the
selection signal 122 of the queue loading control section 103 are also input to this selection circuit 105. The
judgment result signal 123 of the above-mentioned voiced/silent judging section 102 is input to the queue loading
control section 103. The output signal 125 of the selection circuit 105 is input to the voice frame queue 106, and
this voice frame queue 106 outputs voice frames to the voice frame output line 112. The loading indication signal
120 of the above-mentioned queue loading control section 103 is input to the voice frame queue 106 and the
counting circuit 107. The counting circuit 107 outputs its counting result to the queue loading control section 103
as the counted value signal 124. The timing generating section 108 outputs a timing signal 121 to this counting
circuit 107 and the voice frame queue 106.
[0004] Next, the operation is explained. The packetized voice inverse transformation section 101 reproduces
voice frames from the packetized voice input from the packetized voice input line 111, and outputs them to the
voiced/silent judging section 102, the compensation pattern generation section 104, and the selection circuit 105,
respectively. The voiced/silent judging section 102 judges whether the received voice frame is in the silent
condition or the voiced condition, and outputs the voiced/silent judgment result to the queue loading control
section 103 as the judgment result signal 123. The queue loading control section 103 holds this judgment result
signal 123 until the judgment result signal 123 of the following voice frame is input.

[0005] The compensation pattern generation section 104 generates a compensation voice frame from the voice
frame input from the packetized voice inverse transformation section 101, and outputs it to the selection circuit
105. The selection circuit 105 then switches between the compensation voice frame and the voice frame of the
packetized voice inverse transformation section 101 based on the selection signal 122 from the queue loading
control section 103, and outputs the selected frame to the voice frame queue 106. The voice frame queue 106
loads the voice frame from this selection circuit 105 into the queue. At this time, the counting circuit 107
increments the counted value by 1 at each loading operation.

[0006] The voice frame queue 106 outputs the voice frames loaded in the queue one by one, starting from the
voice frame at the head, to the codec side (not illustrated) through the voice frame output line 112, according to
the timing signal 121 of fixed period generated by the timing generating section 108. Each time this voice frame
queue 106 outputs a voice frame, the counting circuit 107 decrements the counted value by 1 in response to the
timing signal 121.

[0007] The queue loading control section 103 continuously monitors the counted value signal 124 input from the
counting circuit 107. When it identifies that the counted value (that is, the remaining number of voice frames
loaded into the queue in the voice frame queue 106) is at or below the preset lower limit threshold number B, it
judges whether the last voice frame in the voice frame queue 106 was silent. If it was silent, the queue loading
control section 103 controls the selection circuit 105 to select the compensation voice frame output by the
compensation pattern generation section 104 and output it to the voice frame queue 106, and outputs the loading
indication signal 120 that causes the voice frame queue 106 to load the compensation voice frame. Conversely,
if the last voice frame in the voice frame queue 106 is voiced, the voice frame from the packetized voice inverse
transformation section 101 is loaded into the voice frame queue 106, and compensation loading of the
compensation voice frame of the compensation pattern generation section 104 into the voice frame queue 106 is
not performed until a silent frame is input. The queue loading control section 103 continues this operation until
the counted value signal 124 exceeds the lower limit threshold number B.
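
By way of a non-limiting illustration, the underflow compensation described in paragraph [0007] can be sketched
as follows. The function name, the tuple representation of a voice frame, and the `make_compensation` callback
are hypothetical and do not appear in the cited publication; the sketch also simplifies the operation by loading
the received frame first and then padding the queue.

```python
from collections import deque

SILENT, VOICED = "silent", "voiced"

def load_with_compensation(queue, frame, lower_threshold_b, make_compensation):
    """Load one received frame, then pad the queue while it is short.

    queue: deque of (kind, payload) tuples; len(queue) plays the role of the
    counted value signal 124. Returns the number of compensation frames added.
    """
    queue.append(frame)
    added = 0
    # Compensation frames are loaded only while the count is at or below B
    # and the last frame in the queue is silent (assumed simplification).
    while len(queue) <= lower_threshold_b and queue[-1][0] == SILENT:
        queue.append((SILENT, make_compensation(frame)))
        added += 1
    return added
```

With an empty queue and a voiced frame, nothing is padded, which corresponds to compensation being deferred
until a silent frame is input.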

[0008] Moreover, if the value of the counted value signal 124 is at or below the preset upper limit threshold
number A, the queue loading control section 103 outputs the selection signal 122 and the loading indication
signal 120 so that the voice frame input from the packetized voice inverse transformation section 101 is loaded
into the voice frame queue 106 through the selection circuit 105 unconditionally, regardless of whether it is
voiced or silent.
[0009] However, if the value of the counted value signal 124 exceeds the upper limit threshold number A, the
queue loading control section 103 will control the selection circuit 105 to load only the silent voice frame from
the packetized voice inverse transformation section 101 to the voice frame queue 106. That is, whenever a
queue loading voice frame number exceeds the upper limit threshold number A, it will not carry out canceling the
voiced voice frame from the packetized voice inverse transformation section 101 and loading to the voice
frame queue 106.

[Problem(s) to be Solved by the Invention] 

[0010] In the conventional voice transmission equipment, a clock with very high precision is used, so there was
a problem that equipment cost became high. Moreover, since control is applied at the input of the queue buffer
that absorbs the clock error, there were the following problems: the sound signal already accumulated in the
queue buffer cannot be controlled; queue buffer control becomes impossible when no silent voice frame appears
after a threshold is exceeded; and, when the queue buffer tends to overflow, a voiced voice frame in which a
significant sound signal is included is discarded. Furthermore, since control was performed in units of voice
frames with a certain time length, there was a problem that control in a unit smaller than a voice frame was
impossible.

[0011] This invention was made in order to solve the above problems, and it aims at obtaining voice
transmission equipment which realizes high voice quality and highly precise clock error absorption at low cost.
[0012] 

[Means for Solving the Problem] The 1st invention has: an IP packet receive section which extracts a sound
signal from a received IP packet; a voice detection section which detects voiced/silent information indicating the
voiced and silent sections of the sound signal extracted by the above-mentioned IP packet receive section; a
buffer section which accumulates the sound signal extracted by the above-mentioned IP packet receive section;
a buffer monitoring section which monitors the accumulated amount of the sound signal accumulated in the
above-mentioned buffer section; a buffer control section which, based on the accumulated amount of the sound
signal monitored by the above-mentioned buffer monitoring section and the voiced/silent information detected by
the above-mentioned voice detection section, inserts a new sound signal into the sound signal accumulated in
the above-mentioned buffer section or discards part of the sound signal accumulated in the above-mentioned
buffer section; and a decode section which decodes the 2nd sound signal resulting from the insertion or
discarding performed by the above-mentioned buffer control section.

[0013] The 2nd invention has a silent sound signal generation section which generates a silent sound signal to
be inserted into the sound signal accumulated in the above-mentioned buffer section, and a marker assigning
section which assigns a marker indicating the above-mentioned voiced/silent information to the sound signal
extracted by the above-mentioned IP packet receive section. The above-mentioned buffer section accumulates
the sound signal to which the marker has been assigned by the above-mentioned marker assigning section.
Based on the accumulated amount of the sound signal monitored by the above-mentioned buffer monitoring
section and the marker assigned to the above-mentioned sound signal, the above-mentioned buffer control
section inserts the silent sound signal generated by the above-mentioned silent sound signal generation section
into a silent section of the sound signal accumulated in the above-mentioned buffer section, or discards the
silent sound signal of a silent section of the sound signal accumulated in the above-mentioned buffer section.
The above-mentioned decode section decodes the 2nd sound signal obtained by removing the marker from the
sound signal into which the silent sound signal has been inserted, or from which it has been discarded, by the
above-mentioned buffer control section.



[0014] The 3rd invention has a buffer control section which, when the accumulated amount of the sound signal
accumulated in the above-mentioned buffer section is at or below a preset lower limit, inserts the silent sound
signal generated by the above-mentioned silent sound signal generation section into a silent section of the
sound signal accumulated in the above-mentioned buffer section, and which, when the accumulated amount of
the above-mentioned sound signal is at or above a preset upper limit, discards the silent sound signal of a
silent section of the sound signal accumulated in the above-mentioned buffer section.
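
By way of a non-limiting illustration, the lower limit and upper limit control of paragraph [0014] can be sketched
as follows. The function name, the representation of the accumulated signal as a list of (marker, payload) pairs,
the one-frame step size, and the choice of the first silent position are all assumptions made for this sketch and
are not specified in this description.

```python
def control_buffer(frames, lower_limit, upper_limit, silence_frame=b"\x00"):
    """Return a new buffer content after one control step.

    frames: list of (voiced, payload); voiced is True for a voiced frame and
    False for a silent one (this flag plays the role of the marker).
    """
    frames = list(frames)
    silent_idx = [i for i, (voiced, _) in enumerate(frames) if not voiced]
    if len(frames) <= lower_limit and silent_idx:
        # Underflow risk: lengthen an existing silent section by one frame.
        frames.insert(silent_idx[0] + 1, (False, silence_frame))
    elif len(frames) >= upper_limit and silent_idx:
        # Overflow risk: drop one stored silent frame; voiced frames are kept.
        del frames[silent_idx[0]]
    return frames
```

Because the sketch edits frames already stored in the buffer rather than frames arriving at its input, voiced
frames are never removed, and the buffer is left unchanged when no silent frame is available.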
[0015] The 4th invention is equipped with a silence duration measuring section which measures the duration of a
silent section of the sound signal accumulated in the above-mentioned buffer section. When the accumulated
amount of the sound signal accumulated in the above-mentioned buffer section is at or below a preset lower
limit, the above-mentioned buffer control section inserts the silent sound signal generated by the
above-mentioned silent sound signal generation section into a silent section of the sound signal accumulated in
the above-mentioned buffer section, based on the above-mentioned duration and the marker assigned to the
above-mentioned sound signal. When the accumulated amount is at or above a preset upper limit, the
above-mentioned buffer control section discards the silent sound signal of a silent section of the sound signal
accumulated in the above-mentioned buffer section, based on the above-mentioned duration and the marker
assigned to the above-mentioned sound signal.
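
By way of a non-limiting illustration, the measurement of silent section durations in paragraph [0015] can be
sketched as follows. The function names and the selection policy (editing in the middle of the longest silent
section that lasts at least `min_run` frames) are assumptions made for this sketch; the description states only
that the duration and the marker are used.

```python
from itertools import groupby

def silent_runs(markers):
    """Return (start, length) for each silent section; markers: True = voiced."""
    runs, pos = [], 0
    for voiced, group in groupby(markers):
        n = len(list(group))
        if not voiced:
            runs.append((pos, n))
        pos += n
    return runs

def pick_edit_point(markers, min_run):
    """Index in the middle of the longest sufficiently long silent section, or None."""
    candidates = [r for r in silent_runs(markers) if r[1] >= min_run]
    if not candidates:
        return None
    start, length = max(candidates, key=lambda r: r[1])
    return start + length // 2
```

Restricting insertion and discarding to long silent sections is one way a duration measurement could keep the
edit away from short pauses inside speech.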

[0016] The 5th invention has a marker assigning section which assigns to the above-mentioned sound signal a
marker indicating a front hangover section, consisting of the silent section that extends from a fixed time before
a change from silent to voiced up to that change, and a hangover section, consisting of the silent section that
extends from a change from voiced to silent until a fixed time afterward. When the accumulated amount of the
sound signal accumulated in the above-mentioned buffer section is at or below a preset lower limit, the
above-mentioned buffer control section inserts the silent sound signal generated by the above-mentioned silent
sound signal generation section into a silent section of the sound signal that is neither the above-mentioned
front hangover section nor the above-mentioned hangover section. When the accumulated amount is at or above
any of two or more preset upper limits, the above-mentioned buffer control section discards, according to these
two or more upper limits, the silent sound signal of the above-mentioned front hangover section, of the
above-mentioned hangover section, or of a silent section that is neither the above-mentioned front hangover
section nor the above-mentioned hangover section.
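
By way of a non-limiting illustration, the marker assignment of paragraph [0016] can be sketched as follows. The
label strings, the function name, the expression of the fixed time as a frame count `hang_len`, and the rule that
the front hangover label wins when a short silent gap qualifies for both labels are assumptions made for this
sketch.

```python
VOICED, PRE, HANG, SILENT = "voiced", "front-hangover", "hangover", "silent"

def label_frames(voiced_flags, hang_len):
    """Label each frame; hang_len is the fixed time expressed in frames."""
    n = len(voiced_flags)
    labels = [VOICED if v else SILENT for v in voiced_flags]
    for i, voiced in enumerate(voiced_flags):
        if not voiced:
            continue
        # Silent frames within hang_len after a voiced-to-silent change.
        j = i + 1
        while j < n and j <= i + hang_len and not voiced_flags[j]:
            labels[j] = HANG
            j += 1
        # Silent frames within hang_len before a silent-to-voiced change.
        j = i - 1
        while j >= 0 and j >= i - hang_len and not voiced_flags[j]:
            labels[j] = PRE
            j -= 1
    return labels
```

Frames that remain labelled as plain silence are the candidates for insertion under the lower limit condition.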
[0017] The 6th invention has a received-data discrimination section which discriminates whether the sound signal
extracted by the above-mentioned IP packet receive section is a sound signal of conversation, and a selector
which, based on the discrimination result of the above-mentioned received-data discrimination section, selects
either a signal other than a sound signal from the above-mentioned IP packet receive section or the sound signal
accumulated in the above-mentioned buffer section. The above-mentioned decode section decodes, according to
the selection result of the above-mentioned selector, the 3rd sound signal obtained by removing the marker from
the sound signal into which the above-mentioned silent sound signal has been inserted or from which it has been
discarded.

[0018] The seventh invention has a facsimile protocol analysis section that judges, based on the result of the
received-data discrimination section, whether the sound signal from the IP packet receiving section is a
facsimile signal and, when it is, analyzes the protocol of that facsimile signal. When a facsimile signal is
accumulated in the buffer section, the buffer control section, based on the analysis information from the
facsimile protocol analysis section, inserts the silent sound signal into a silent section of the sound signal
where insertion or discard causes no problem for the protocol of the accumulated facsimile signal, or discards
the silent sound signal of such a silent section.

[0019] The eighth invention has: an IP packet receiving section that extracts a sound signal from a received IP
packet; a decoding section that decodes the sound signal extracted by the IP packet receiving section; a voice
detection section that detects, from the third sound signal decoded by the decoding section, voiced/silent
information indicating its voiced and silent sections; a buffer section that accumulates the third sound signal
decoded by the decoding section; a buffer amount monitoring section that monitors the accumulated amount of the
third sound signal in the buffer section; and a buffer control section that, based on the accumulated amount
monitored by the buffer amount monitoring section and the voiced/silent information detected by the voice
detection section, inserts a new sound signal into the third sound signal accumulated in the buffer section or
discards the third sound signal accumulated in the buffer section.

[0020] The ninth invention has a silent sound signal generation section that generates the silent sound signal
to be inserted into the third sound signal accumulated in the buffer section, and a marker attaching section
that attaches a marker indicating voiced/silent information to the third sound signal decoded by the decoding
section. The buffer section accumulates the third sound signal to which the marker has been attached by the
marker attaching section. Based on the accumulated amount of the third sound signal monitored by the buffer
amount monitoring section and the marker attached to the third sound signal, the buffer control section inserts
the silent sound signal generated by the silent sound signal generation section into a silent section of the
third sound signal accumulated in the buffer section, or discards the silent sound signal of a silent section
of the third sound signal accumulated in the buffer section.

[0021] The tenth invention has a buffer control section that, when the accumulated amount of the third sound
signal in the buffer section is at or below a preset lower limit, inserts the silent sound signal generated by
the silent sound signal generation section into a silent section of the third sound signal accumulated in the
buffer section, and, when the accumulated amount is at or above a preset upper limit, discards the silent sound
signal of a silent section of the third sound signal accumulated in the buffer section.
[0022] The eleventh invention has a buffer control section that, when the accumulated amount of the third sound
signal in the buffer section is at or below any of a plurality of preset lower limits, inserts the silent sound
signal generated by the silent sound signal generation section into a silent section of the third sound signal
according to these lower limits, and, when the accumulated amount is at or above any of a plurality of preset
upper limits, discards the silent sound signal of a silent section of the third sound signal according to these
upper limits.

[0023] The twelfth invention has a buffer control section that, when the accumulated amount is at or below the
lowest of the plurality of lower limits and there is no silent section in the third sound signal accumulated in
the buffer section, performs interpolation processing that inserts a signal into the part of the voiced section
of the third sound signal where the signal is missing.

[0024] The thirteenth invention has a silence duration measuring section that measures the duration of the
silent section of the third sound signal accumulated in the buffer section. When the accumulated amount of the
third sound signal in the buffer section is at or below a preset lower limit, the buffer control section, based
on the measured duration and the marker attached to the third sound signal, inserts the silent sound signal
generated by the silent sound signal generation section into a silent section of the third sound signal
accumulated in the buffer section. When the accumulated amount is at or above a preset upper limit, the buffer
control section, based on the measured duration and the marker attached to the third sound signal, discards the
silent sound signal of a silent section of the third sound signal accumulated in the buffer section.

[0025] The fourteenth invention has a marker attaching section that attaches to the third sound signal a marker
indicating a front hangover section, which is the silent section extending a fixed time before the point of
change from silent to voiced, and a hangover section, which is the silent section extending a fixed time after
the point of change from voiced to silent. When the accumulated amount of the third sound signal in the buffer
section is at or below a preset lower limit, the buffer control section inserts the silent sound signal into a
silent section of the third sound signal that is neither the front hangover section nor the hangover section.
When the accumulated amount is at or above any of a plurality of preset upper limits, the buffer control
section discards, according to these upper limits, the silent sound signal of the front hangover section, of
the hangover section, or of a silent section that is neither the front hangover section nor the hangover
section.

[0026] The fifteenth invention has a received-data discrimination section that determines whether the third
sound signal decoded by the decoding section is a conversational sound signal, and a selector that, based on
the result determined by the received-data discrimination section, selects either a signal from the IP packet
receiving section other than the third sound signal or the third sound signal accumulated in the buffer
section.
[0027] The sixteenth invention has a facsimile protocol analysis section that judges, based on the result of
the received-data discrimination section, whether the third sound signal decoded by the decoding section is a
facsimile signal and, when it is, analyzes the protocol of that facsimile signal. When a facsimile signal is
accumulated in the buffer section, the buffer control section, based on the analysis information from the
facsimile protocol analysis section, inserts the silent sound signal into a silent section of the third sound
signal where insertion or discard causes no problem for the protocol of the accumulated facsimile signal, or
discards the silent sound signal of such a silent section of the third sound signal.

[0028] 

[Embodiment of the Invention] Embodiment 1. Embodiment 1 is described below with reference to the drawings.
Fig. 1 is a block diagram of the voice transmission device of Embodiment 1. In Fig. 1, 10 is an IP packet
receiving section that extracts a coded sound signal from a received voice IP packet; 11 is a voice detection
section that detects and judges the voiced/silent state of the coded sound signal output from the IP packet
receiving section 10; 12 is a marker attaching section that, based on the information from the voice detection
section 11, attaches a marker indicating voiced/silent to the coded sound signal from the IP packet receiving
section 10; 13 is a silent coded sound signal generation section that generates and outputs a silent coded
sound signal on instruction from the buffer control section 15; 14 is a buffer section that temporarily
accumulates the coded sound signal input through the marker attaching section 12; 15 is a buffer control
section that, based on the information from the buffer amount monitoring section 16, inserts the silent coded
sound signal from the silent coded sound signal generation section 13 into the coded sound signal temporarily
accumulated in the buffer section 14, or discards the coded sound signal in the buffer section 14; 16 is a
buffer amount monitoring section that monitors the accumulated amount of the coded sound signal in the buffer;
and 17 is a decoding section that decodes the coded sound signal output periodically from the buffer section 14
according to the clock inside the device.

[0029] Next, the operation is described. The IP packet receiving section 10 extracts the coded sound signal
stored in the input voice IP packet and outputs it to the voice detection section 11 and the marker attaching
section 12. The voice detection section 11 detects and judges whether the coded sound signal is in a voiced
state or a silent state, using the voice coding parameters contained in the coded sound signal or the voice
level of the sound signal obtained by simple decoding processing or the like, and outputs the result to the
marker attaching section 12. Based on the voiced/silent information from the voice detection section 11, the
marker attaching section 12 attaches to the coded sound signal input from the IP packet receiving section 10 a
marker, as header information of the coded sound signal, indicating whether it is in a voiced state or a silent
state, and outputs it to the buffer section 14.
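
The marker attachment described above can be sketched as follows. This is a minimal illustration, not part of the original disclosure: the `MarkedFrame` type, the `attach_markers` function, and the use of a per-frame level compared against a threshold are all assumptions made for the example, and treating a level exactly equal to the threshold as voiced is likewise an assumed convention.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MarkedFrame:
    """Hypothetical container: one coded voice frame plus its marker."""
    payload: bytes   # coded sound signal for one voice frame
    voiced: bool     # marker carried as header information


def attach_markers(payloads: Sequence[bytes],
                   levels: Sequence[float],
                   threshold: float) -> List[MarkedFrame]:
    """Pair each coded frame with a voiced/silent marker.

    A frame is marked voiced when its level is at or above the threshold
    (assumed convention), otherwise silent.
    """
    return [MarkedFrame(p, lvl >= threshold) for p, lvl in zip(payloads, levels)]


frames = attach_markers([b"f0", b"f1", b"f2"], [0.9, 0.1, 0.5], threshold=0.5)
print([f.voiced for f in frames])  # [True, False, True]
```

Keeping the marker next to the payload lets the buffer control logic find silent frames without decoding them, and the marker can be dropped when the frame is handed to the decoder.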

[0030] The silent coded sound signal generation section 13, on instruction from the buffer control section 15,
generates a silent coded sound signal coded with the same coding method as the coded sound signal input to the
buffer section 14, and outputs it to the buffer section 14. The buffer section 14 temporarily accumulates the
coded sound signal input through the marker attaching section 12 and, after discard and insertion of coded
sound signals have been performed by the buffer control section 15, periodically outputs the coded sound
signal, with the marker information removed, to the decoding section 17 based on the clock of the
receiving-side device. Although insertion and discard are performed, the order of the coded sound signals
accumulated in the buffer section 14 is not disturbed, and they are output in the order in which they were
input.

[0031] Here, input to the buffer section 14 is performed based on the clock of the IP packet transmission
source, and output from the buffer section 14 is performed based on the clock of the receiving-side device.
Therefore, when the clock of the transmission source is faster than the clock of the receiving-side device, the
accumulated amount in the buffer section 14 tends to increase; conversely, when the clock of the transmission
source is slower than the clock of the receiving-side device, the accumulated amount in the buffer section 14
tends to decrease. The buffer amount monitoring section 16 monitors the input and output of the buffer section
14 and the insertion and discard operations, and notifies the buffer control section 15 of the accumulated
amount of the coded sound signal in the buffer section 14. The decoding section 17 decodes the coded sound
signal output from the buffer section 14 and outputs it as a sound signal.
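
The size of this effect can be estimated with simple arithmetic. The sketch below is illustrative only: the function name, the parts-per-million clock offset, and the example values (100 ppm, 20 ms frames, one hour) are assumptions chosen for the example and do not appear in the original text.

```python
def buffer_drift_frames(tx_offset_ppm: float, frame_ms: float, seconds: float) -> float:
    """Approximate net change in buffered frames caused by clock mismatch.

    tx_offset_ppm > 0 means the transmission source clock is faster than the
    receiving-side clock, so frames arrive faster than they are read out and
    the result is positive (the buffer grows). A negative offset gives a
    negative result (the buffer drains).
    """
    frames_per_second = 1000.0 / frame_ms
    return frames_per_second * seconds * tx_offset_ppm * 1e-6


# Assumed example: source 100 ppm fast, 20 ms frames, one hour of speech.
# The buffer gains roughly 18 frames over the hour.
print(round(buffer_drift_frames(100, 20, 3600), 3))
```

Even a small clock offset therefore accumulates into a noticeable surplus or shortage of frames over a long call, which is why the accumulated amount has to be corrected from time to time.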

[0032] The operation of the buffer control section 15 is described. Fig. 2 is a flow chart showing the
operation of the buffer control section 15. The buffer control section 15 checks the accumulated amount in the
buffer section 14 based on the information from the buffer amount monitoring section 16 (step S1), and judges
whether the accumulated amount is at or below a predetermined lower limit (step S2).
[0033] If it is at or below the lower limit, the markers attached to the coded sound signals accumulated in the
buffer section 14 are examined to find a silent section, the silent coded sound signal generation section 13 is
instructed to generate a silent coded sound signal, and one voice frame of that signal is inserted into the
silent section, thereby increasing the accumulated amount in the buffer section 14 (step S3). If it is not at
or below the lower limit, it is judged whether the accumulated amount is at or above a predetermined upper
limit (step S4). If it is at or above the upper limit, the markers attached to the coded sound signals
accumulated in the buffer section 14 are examined to find a silent section, and one voice frame of the silent
coded sound signal in that silent section is discarded, thereby reducing the accumulated amount in the buffer
section 14 (step S5). If it is not at or above the upper limit, no processing is performed.
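
The flow of steps S1 to S5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the buffer is modeled as a Python list of `(payload, voiced)` tuples, the names `find_silent`, `control_step`, and `make_silence` are hypothetical, and doing nothing when no silent frame is found is an assumption, since this embodiment does not say what happens in that case.

```python
from typing import Callable, List, Optional, Tuple

Frame = Tuple[bytes, bool]  # (coded frame, voiced marker); hypothetical layout


def find_silent(buffer: List[Frame]) -> Optional[int]:
    """Return the index of the first frame marked silent, or None."""
    for i, (_, voiced) in enumerate(buffer):
        if not voiced:
            return i
    return None


def control_step(buffer: List[Frame], lower: int, upper: int,
                 make_silence: Callable[[], bytes]) -> str:
    """Run one pass of the S1..S5 flow; mutates buffer and reports the action."""
    amount = len(buffer)                       # S1: check accumulated amount
    if amount <= lower:                        # S2: at or below lower limit?
        idx = find_silent(buffer)
        if idx is not None:                    # S3: insert one silent frame
            buffer.insert(idx, (make_silence(), False))
            return "inserted"
    elif amount >= upper:                      # S4: at or above upper limit?
        idx = find_silent(buffer)
        if idx is not None:                    # S5: discard one silent frame
            del buffer[idx]
            return "discarded"
    return "none"                              # assumed: no silent frame, no action


buf = [(b"v", True), (b"s", False)]
print(control_step(buf, lower=2, upper=5, make_silence=lambda: b"\x00"), len(buf))
```

Because the silent frame is inserted next to an existing silent frame and only silent frames are removed, the relative order of the remaining frames is unchanged.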

[0034] Although a coded sound signal has been used as the example here, this embodiment and the following
embodiments are not limited to coded sound signals and also apply to sound signals that are not coded.
Accordingly, in each embodiment, "coded sound signal" may be read as "sound signal", and "silent coded sound
signal" as "silent sound signal".
[0035] As described above, by attaching a marker indicating voiced/silent to the received coded sound signal,
accumulating it temporarily in a buffer, and inserting or discarding coded sound signals in the buffer
according to the accumulated amount of the buffer, the difference between the clock of the transmission source
(transmitting-side device) and the clock of the transmission destination (receiving-side device) can be
absorbed, and a clock difference absorbing function with higher voice quality and high accuracy can be realized
at low cost.

[0036] Embodiment 2. Embodiment 2 is described below with reference to the drawings. Fig. 3 is a block diagram
of the voice transmission device of Embodiment 2. In Fig. 3, the same reference numerals as in Fig. 1 denote
the same or corresponding parts, and their description is omitted. 18 is a silence duration measuring section
that measures the duration of the silent section input to the buffer section 14.
[0037] Next, the operation is described. The IP packet receiving section 10 extracts the coded sound signal
stored in the input voice IP packet and outputs it to the voice detection section 11 and the marker attaching
section 12. The voice detection section 11 detects and judges whether the coded sound signal is in a voiced
state or a silent state, using the voice coding parameters contained in the coded sound signal or the voice
level of the sound signal obtained by simple decoding processing or the like, and outputs the result to the
marker attaching section 12. Based on the voiced/silent information from the voice detection section 11, the
marker attaching section 12 attaches to the coded sound signal input from the IP packet receiving section 10 a
marker, as header information of the coded sound signal, indicating whether it is in a voiced state or a silent
state, and outputs it to the buffer section 14.

[0038] The silence duration measuring section 18 monitors the markers attached to the coded sound signals input
to the buffer section 14, measures the duration of the silent state of the coded sound signal input to the
buffer section 14, and notifies the buffer control section 15 of the result. The silent coded sound signal
generation section 13, on instruction from the buffer control section 15, generates a silent coded sound signal
coded with the same coding method as the coded sound signal input to the buffer section 14, and outputs it to
the buffer section 14. The buffer section 14 temporarily accumulates the coded sound signal input through the
marker attaching section 12 and, after discard and insertion of coded sound signals have been performed by the
buffer control section 15, periodically outputs the coded sound signal, with the marker information removed, to
the decoding section 17 based on the clock of the receiving-side device. Although insertion and discard are
performed, the order of the coded sound signals accumulated in the buffer section 14 is not disturbed, and they
are output in the order in which they were input.

[0039] Here, input to the buffer section 14 is performed based on the clock of the IP packet transmission
source, and output from the buffer section 14 is performed based on the clock of the receiving-side device.
Therefore, when the clock of the transmission source is faster than the clock of the receiving-side device, the
accumulated amount in the buffer section 14 tends to increase; conversely, when the clock of the transmission
source is slower than the clock of the receiving-side device, the accumulated amount in the buffer section 14
tends to decrease.

[0040] The buffer amount monitoring section 16 monitors the input and output of the buffer section 14 and the
insertion and discard operations, and notifies the buffer control section 15 of the accumulated amount of the
coded sound signal in the buffer section 14. The decoding section 17 decodes the coded sound signal output from
the buffer section 14 and outputs it as a sound signal.
[0041] The operation of the buffer control section 15 is described. Fig. 4 is a flow chart showing the
operation of the buffer control section 15. The buffer control section 15 checks the accumulated amount in the
buffer section 14 based on the information from the buffer amount monitoring section 16 (step S11), and judges
whether the accumulated amount is at or below a predetermined lower limit (step S12).
[0042] If it is at or below the lower limit, the duration of the silent state is checked based on the
information from the silence duration measuring section 18 (step S13), and it is judged whether the silent
state duration is shorter than a predetermined threshold (step S14). If it is shorter than the threshold, the
markers attached to the coded sound signals accumulated in the buffer section 14 are examined to find a silent
section, the silent coded sound signal generation section 13 is instructed to generate a silent coded sound
signal, and N voice frames of that signal are inserted into the silent section, thereby increasing the
accumulated amount in the buffer section 14 (step S15). If it is not shorter than the threshold, the markers
attached to the coded sound signals accumulated in the buffer section 14 are examined to find a silent section,
the silent coded sound signal generation section 13 is instructed to generate a silent coded sound signal, and
M voice frames of that signal are inserted into the silent section, thereby increasing the accumulated amount
in the buffer section 14 (step S16). Here, N is assumed to be smaller than M.

[0043] If the accumulated amount of the buffer is not at or below the lower limit, it is judged whether the
accumulated amount is at or above a predetermined upper limit (step S17). If it is at or above the upper limit,
the duration of the silent state is checked based on the information from the silence duration measuring
section 18 (step S18), and it is judged whether the silent state duration is shorter than a predetermined
threshold (step S19). If it is shorter than the threshold, the markers attached to the coded sound signals
accumulated in the buffer section 14 are examined to find a silent section, and X voice frames of the silent
coded sound signal in that silent section are discarded, thereby reducing the accumulated amount in the buffer
section 14 (step S20). If it is not shorter than the threshold, the markers attached to the coded sound signals
accumulated in the buffer section 14 are examined to find a silent section, and Y voice frames of the silent
coded sound signal in that silent section are discarded, thereby reducing the accumulated amount in the buffer
section 14 (step S21). Here, X is assumed to be smaller than Y. If the accumulated amount of the buffer is not
at or above the upper limit, no processing is performed.
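
The decision made in steps S12 to S21 can be sketched as a single function. This is only an illustration under stated assumptions: the function name `plan_adjustment`, its parameter names, and the use of milliseconds for the silence duration are hypothetical, and the example values are chosen purely for demonstration.

```python
from typing import Tuple


def plan_adjustment(amount: int, lower: int, upper: int,
                    silence_ms: float, threshold_ms: float,
                    n: int, m: int, x: int, y: int) -> Tuple[str, int]:
    """Return (action, frame_count) for one pass of the Embodiment 2 flow.

    n/m are the frame counts inserted for short/long silence (N < M) and
    x/y the frame counts discarded for short/long silence (X < Y).
    """
    if not (n < m and x < y):
        raise ValueError("expected N < M and X < Y")
    short = silence_ms < threshold_ms          # S14 / S19: shorter than threshold?
    if amount <= lower:                        # S12: at or below lower limit
        return ("insert", n if short else m)   # S15 / S16
    if amount >= upper:                        # S17: at or above upper limit
        return ("discard", x if short else y)  # S20 / S21
    return ("none", 0)


# Assumed example values: a long silence allows a larger correction.
print(plan_adjustment(2, lower=3, upper=10, silence_ms=200, threshold_ms=100,
                      n=1, m=3, x=1, y=2))  # ('insert', 3)
```

Tying the size of the correction to the length of the silence keeps changes small where a pause is short and therefore more noticeable, and allows faster recovery where the pause is long.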

[0044] As described above, by attaching a marker indicating voiced/silent to the received coded sound signal,
accumulating it temporarily in a buffer, inserting or discarding coded sound signals in the buffer according to
the accumulated amount of the buffer, and adjusting the amount inserted or discarded according to the length of
the silent section in which the insertion or discard is performed, the difference between the clock of the
transmission source and the clock of the receiving-side device can be absorbed, and a clock difference
absorbing function with higher voice quality and high accuracy can be realized at low cost.

[0045] Embodiment 3. Embodiment 3 is described below with reference to the drawings. Fig. 5 is a block diagram
of the voice transmission device of Embodiment 3. In Fig. 5, the same reference numerals as in Fig. 1 denote
the same or corresponding parts, and their description is omitted. 19 is a second marker attaching section
that, based on the information from the voice detection section 11, attaches a marker indicating a front
hangover or a hangover to the coded sound signal input through the marker attaching section 12.

[0046] Next, the operation is described. The IP packet receiving section 10 extracts the coded sound signal
stored in the input voice IP packet and outputs it to the voice detection section 11 and the marker attaching
section 12. The voice detection section 11 detects and judges whether the coded sound signal is in a voiced
state or a silent state, using the voice coding parameters contained in the coded sound signal or the voice
level of the sound signal obtained by simple decoding processing or the like, and outputs the result to the
marker attaching section 12. Based on the voiced/silent information from the voice detection section 11, the
marker attaching section 12 attaches to the coded sound signal input from the IP packet receiving section 10 a
marker, as header information of the coded sound signal, indicating whether it is in a voiced state or a silent
state, and outputs it to the second marker attaching section 19.

[0047] Based on the voiced/silent information from the voice detection section 11, the second marker attaching
section 19 treats the part extending a certain fixed time before the point of change from the silent state to
the voiced state as a front hangover, and the part extending a certain fixed time after the point of change
from the voiced state to the silent state as a hangover. In the same way as the marker attaching section 12, it
attaches to the coded sound signal input through the marker attaching section 12 a second marker, as header
information of the coded sound signal, indicating whether it is a front hangover part or a hangover part, and
outputs it to the buffer section 14.

[0048] Here, the front hangover and the hangover are explained. Fig. 6 schematically shows the magnitude of a
sound signal, the threshold used for the voiced/silent judgment, and the result of that judgment. In general,
the voiced/silent judgment compares the magnitude of the voice (sound pressure level) with a threshold and
judges it voiced if it is at or above the threshold and silent if it is below the threshold. In actual speech,
however, there are parts at the beginning or end of a word, where the sound is rising or falling, that are
below the threshold yet important as sound. These are called ****, the word-initial part, and the word-final
part. In Fig. 6, this part exists as voice but, because it is below the voiced threshold, it corresponds to the
shaded area that is judged silent. To cover this part, the part extending a certain fixed time before the point
of change from silent to voiced is treated as a front hangover, and the part extending a certain fixed time
after the point of change from voiced to silent is treated as a hangover, so that both are distinguished from
ordinary silent parts.
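
The labelling of front hangover and hangover parts can be sketched as follows. This is a minimal illustration under stated assumptions: the function `label_frames`, the label strings, and expressing the fixed times as frame counts (`pre`, `post`) are hypothetical, and giving the front hangover precedence when a silent frame falls into both windows is a choice made here, not something the text specifies.

```python
from typing import List, Sequence


def label_frames(voiced: Sequence[bool], pre: int, post: int) -> List[str]:
    """Label each frame as voiced, silent, front_hangover, or hangover.

    pre  : number of silent frames before a silent-to-voiced change that
           count as front hangover (assumed unit: frames).
    post : number of silent frames after a voiced-to-silent change that
           count as hangover (assumed unit: frames).
    """
    n = len(voiced)
    labels = ["voiced" if v else "silent" for v in voiced]

    # Hangover: silent frames following a voiced-to-silent change.
    for i in range(1, n):
        if voiced[i - 1] and not voiced[i]:
            for j in range(i, min(n, i + post)):
                if voiced[j]:
                    break
                labels[j] = "hangover"

    # Front hangover: silent frames preceding a silent-to-voiced change.
    # Applied second, so it takes precedence on overlap (assumption).
    for i in range(1, n):
        if not voiced[i - 1] and voiced[i]:
            for j in range(i - 1, max(-1, i - 1 - pre), -1):
                if voiced[j]:
                    break
                labels[j] = "front_hangover"
    return labels


flags = [False, False, False, True, True, False, False, False, False]
print(label_frames(flags, pre=1, post=2))
```

With these inputs, the single silent frame just before the voiced run is labelled `front_hangover`, the two silent frames just after it are labelled `hangover`, and the remaining silent frames stay `silent`.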

[0049] The silent coded sound signal generation section 13, on instruction from the buffer control section 15,
generates a silent coded sound signal coded with the same coding method as the coded sound signal input to the
buffer section 14, and outputs it to the buffer section 14. The buffer section 14 temporarily accumulates the
coded sound signal input through the marker attaching section 12 and the second marker attaching section 19
and, after discard and insertion of coded sound signals have been performed by the buffer control section 15,
periodically outputs the coded sound signal, with the marker information removed, to the decoding section 17
based on the clock of the receiving-side device. Although insertion and discard are performed, the order of the
coded sound signals accumulated in the buffer section 14 is not disturbed, and they are output in the order in
which they were input.

[0050] Here, input to the buffer section 14 is performed based on the clock of the IP packet transmission
source, and output from the buffer section 14 is performed based on the clock of the receiving-side device.
Therefore, when the clock of the transmission source is faster than the clock of the receiving-side device, the
accumulated amount in the buffer section 14 tends to increase; conversely, when the clock of the transmission
source is slower than the clock of the receiving-side device, the accumulated amount in the buffer section 14
tends to decrease. The buffer amount monitoring section 16 monitors the input and output of the buffer section
14 and the insertion and discard operations, and notifies the buffer control section 15 of the accumulated
amount of the coded sound signal in the buffer section 14. The decoding section 17 decodes the coded sound
signal output from the buffer section 14 and outputs it as a sound signal.

[0051] The operation of the buffer control section 15 is described. Fig. 7 is a flow chart showing the
operation of the buffer control section 15. The buffer control section 15 checks the accumulated amount in the
buffer section 14 based on the information from the buffer amount monitoring section 16 (step S31), and judges
whether the accumulated amount is at or below a predetermined lower limit (step S32).
[0052] If it is at or below the lower limit, the markers attached to the coded sound signals accumulated in the
buffer section 14 are examined to find a silent section that is neither the front hangover section nor the
hangover section, the silent coded sound signal generation section 13 is instructed to generate a silent coded
sound signal, and one voice frame of that signal is inserted into the silent section, thereby increasing the
accumulated amount in the buffer section 14 (step S33). If it is not at or below the lower limit, it is judged
whether the accumulated amount is at or above a predetermined first upper limit (step S34). If it is at or
above the first upper limit, it is judged whether the accumulated amount is at or above a predetermined second
upper limit (step S35). If it is at or above the second upper limit, it is judged whether the accumulated
amount is at or above a predetermined third upper limit (step S37).

[0053] If it is at or above the third upper limit, the markers attached to the coded sound signals accumulated
in the buffer section 14 are examined to find the front hangover section, and one voice frame of the coded
sound signal in the front hangover section is discarded, thereby reducing the accumulated amount in the buffer
section 14 (step S39). If it is not at or above the third upper limit (that is, if it is at or above the second
upper limit and below the third upper limit), the markers attached to the coded sound signals accumulated in
the buffer section 14 are examined to find the hangover section, and one voice frame of the coded sound signal
in the hangover section is discarded, thereby reducing the accumulated amount in the buffer section 14 (step
S38).

[0054] If the accumulated amount is not at or above the second upper limit (i.e., it is at or above the first
upper limit but below the second upper limit), the markers attached to the coded voice signals stored in the
buffer section 14 are examined to find a silent section that is neither a pre-hangover section nor a hangover
section, and one voice frame of the coded voice signal in that silent section is discarded, which reduces the
accumulated amount in the buffer section 14 (step S36). If the amount is not at or above the first upper limit,
no processing is performed.
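
The discard side of this flow (steps S34 to S39) can be sketched as follows. This is only an illustrative
sketch, not the implementation of the device: the names Section, discard_target and discard_one_frame are
hypothetical, the buffer is modelled as a list of (marker, payload) pairs, and the limits are assumed to
satisfy upper1 < upper2 < upper3.

```python
from enum import Enum


class Section(Enum):
    """Marker classes attached to each buffered frame (hypothetical names)."""
    VOICED = "voiced"
    SILENT = "silent"              # silent, neither pre-hangover nor hangover
    PRE_HANGOVER = "pre_hangover"
    HANGOVER = "hangover"


def discard_target(amount, upper1, upper2, upper3):
    """Return the section class from which one frame is discarded, or None.

    Assumes upper1 < upper2 < upper3 (an assumption of this sketch).
    """
    if amount >= upper3:
        return Section.PRE_HANGOVER    # step S39
    if amount >= upper2:
        return Section.HANGOVER        # step S38
    if amount >= upper1:
        return Section.SILENT          # step S36
    return None                        # below the first upper limit: no processing


def discard_one_frame(buffer, target):
    """Delete the first frame whose marker equals target; order of the rest is kept."""
    for i, (marker, _payload) in enumerate(buffer):
        if marker is target:
            del buffer[i]
            return True
    return False
```

The design choice visible here is that the fuller the buffer, the more audible the class of frame that may be
sacrificed: plain silence first, then hangover, then pre-hangover.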

[0055] As described above, a marker indicating voiced/silent and a marker indicating the pre-hangover section
and the hangover section are attached to the received coded voice signal, which is then stored temporarily in a
buffer. Coded voice signals in the buffer are inserted or discarded according to the accumulated amount of the
buffer and according to the classification (voiced / silent / pre-hangover / hangover) of the coded voice
signals in the buffer. This absorbs the difference between the clock of the transmission source and the clock
of the receiver side device, so that a clock difference absorbing function with higher voice quality and high
precision can be realized at low cost.

[0056] In addition, when coded voice signals in the buffer are discarded, providing two or more upper limits for
the coded voice signals stored in the buffer section, as described in this embodiment, makes it possible to
discard first, from the lowest upper limit upward, the coded voice signals whose discarding causes the fewest
problems, according to the classification (voiced / silent / pre-hangover / hangover) of the coded voice signals
in the buffer. In the example of this embodiment, the sections are discarded in the following order: at or above
the first (lowest) upper limit and below the second upper limit, the silent section that is neither a
pre-hangover section nor a hangover section, whose discarding causes few problems; at or above the second upper
limit and below the third upper limit, the hangover section; and at or above the third upper limit, the
pre-hangover section.



[0057] Embodiment 4. Embodiment 4 is described below with reference to the drawings. Fig. 8 is a block diagram
of the voice transmission device of Embodiment 4. In Fig. 8, reference numerals identical to those in Fig. 1
denote the same or corresponding parts, and their description is omitted. Reference numeral 20 denotes a
received-data discrimination section that determines whether the coded voice signal from the IP packet receive
section 10 is a voice signal, such as ordinary conversation, or a signal other than a voice signal, such as a
facsimile signal. Reference numeral 21 denotes a selector that, based on the discrimination result of the
received-data discrimination section 20, selects either the input from the IP packet receive section 10 or the
input from the buffer section 14 as the output to the decoding section 17.

[0058] Next, the operation is described. The IP packet receive section 10 extracts the coded voice signal
stored in the input voice IP packet and outputs it to the voice detection section 11 and the marker assignment
section 12. The voice detection section 11 detects and judges whether the coded voice signal concerned is in a
voiced state or a silent state, from the voice level of the voice signal obtained from the voice coding
parameters contained in the coded voice signal or by simple decoding processing or the like, and outputs the
result to the marker assignment section 12. Based on the voiced/silent information from the voice detection
section 11, the marker assignment section 12 attaches to the coded voice signal input from the IP packet receive
section 10 a marker, as header information of the coded voice signal, indicating whether the signal is in a
voiced state or a silent state, and outputs the result to the buffer section 14.

[0059] The received-data discrimination section 20 determines whether the coded voice signal input from the IP
packet receive section 10 is a voice signal, such as ordinary conversation, or a signal other than a voice
signal, such as a facsimile signal, and outputs the result to the selector 21. In accordance with the
instruction from the received-data discrimination section 20, the selector 21 selects the input from the buffer
section 14 and outputs it to the decoding section 17 if the signal has been determined to be a voice signal, and
selects the input from the IP packet receive section 10 and outputs it to the decoding section 17 if the signal
has been determined to be a signal other than a voice signal, such as a facsimile signal. In accordance with
instructions from the buffer control section 15, the silent coded voice signal generation section 13 generates
a silent coded voice signal coded by the same coding method as the coded voice signal input to the buffer
section 14, and outputs it to the buffer section 14.
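
The routing performed by the selector 21 can be illustrated with a very small sketch. The function name
select_decoder_input and its arguments are hypothetical; how the received-data discrimination section 20
actually tells voice from facsimile is not modelled here and is simply passed in as a boolean.

```python
def select_decoder_input(is_voice, from_buffer, from_receiver):
    """Choose what is handed to the decoding section.

    is_voice      -- result of the received-data discrimination (assumed boolean)
    from_buffer   -- signal read out of the buffer section (clock-adjusted path)
    from_receiver -- signal taken directly from the IP packet receive section
    """
    # Voice goes through the buffer, where insertion/discarding may be applied;
    # non-voice data such as facsimile bypasses the buffer untouched.
    return from_buffer if is_voice else from_receiver
```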

[0060] The buffer section 14 temporarily stores the coded voice signal input through the marker assignment
section 12. After discarding and insertion of coded voice signals have been performed by the buffer control
section 15, the buffer section 14 periodically outputs the coded voice signal, with the marker information
removed, to the decoding section 17 based on the clock of the receiver side device. Although insertion and
discarding are performed, the order of the coded voice signals stored in the buffer section 14 is not disturbed,
and they are output in the order in which they were input.

[0061] Here, the input to the buffer section 14 is performed based on the clock of the IP packet transmission
source, and the output from the buffer section 14 is performed based on the clock of the receiver side device.
Therefore, when the clock of the transmission source is faster than the clock of the receiver side device, the
accumulated amount of the buffer section 14 tends to increase; conversely, when the clock of the transmission
source is slower than the clock of the receiver side device, the accumulated amount of the buffer section 14
tends to decrease. The buffer capacity monitor section 16 monitors the input and output status of the buffer
section 14 and the insertion and discard status, and notifies the buffer control section 15 of the accumulated
amount of coded voice signals in the buffer section 14. The decoding section 17 decodes the coded voice signal
output from the buffer section 14 and outputs it as a voice signal. The operation of the buffer control section
15 is the same as that described with reference to Fig. 2 in Embodiment 1, and its description is omitted.
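
The tendency described above can be made concrete with a small calculation. The figures used in the usage note
(8000 Hz sampling, a 50 ppm clock offset, 160-sample frames) are illustrative assumptions and do not appear in
this specification; the function names are likewise hypothetical.

```python
def drift_samples_per_second(sample_rate_hz, ppm_offset):
    """Net samples gained (+) or lost (-) in the buffer per second.

    ppm_offset > 0 means the transmission-source clock is faster than the
    receiver-side clock, so the buffer tends to fill.
    """
    return sample_rate_hz * ppm_offset / 1_000_000


def seconds_until_adjustment(sample_rate_hz, ppm_offset, unit_samples):
    """Time for the drift to amount to one insertion/discard unit."""
    rate = abs(drift_samples_per_second(sample_rate_hz, ppm_offset))
    if rate == 0:
        return float("inf")   # identical clocks: no adjustment is ever needed
    return unit_samples / rate
```

Under these assumed figures, the buffer gains about 0.4 samples per second, so one 160-sample frame would need
to be discarded roughly every 400 seconds, which is why adjustments can be confined to silent sections.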

[0062] As described above, a marker indicating voiced/silent is attached to the received coded voice signal,
which is stored temporarily in a buffer, and coded voice signals in the buffer are inserted or discarded
according to the accumulated amount of the buffer. This absorbs the difference between the clock of the
transmission source and the clock of the receiver side device, so that a clock difference absorbing function
with higher voice quality and high precision can be realized at low cost. Moreover, by discriminating signals
other than voice signals, for which the same processing as for voice signals may be unsuitable, high
communication quality can also be provided for signals other than voice signals such as ordinary conversation,
for example facsimile signals.
[0063] Embodiment 5. Embodiment 5 is described below with reference to the drawings. Fig. 9 is a block diagram
of the voice transmission device of Embodiment 5. In Fig. 9, reference numerals identical to those in Fig. 8
denote the same or corresponding parts, and their description is omitted. Reference numeral 22 denotes a
facsimile protocol analysis section that judges whether the coded voice signal from the IP packet receive
section 10 is a facsimile signal and, if it is, analyzes its protocol.

[0064] Next, the operation is described. The IP packet receive section 10 extracts the coded voice signal
stored in the input voice IP packet and outputs it to the voice detection section 11 and the marker assignment
section 12. The voice detection section 11 detects and judges whether the coded voice signal concerned is in a
voiced state or a silent state, from the voice level of the voice signal obtained from the voice coding
parameters contained in the coded voice signal or by simple decoding processing or the like, and outputs the
result to the marker assignment section 12. Based on the voiced/silent information from the voice detection
section 11, the marker assignment section 12 attaches to the coded voice signal input from the IP packet receive
section 10 a marker, as header information of the coded voice signal, indicating whether the signal is in a
voiced state or a silent state, and outputs the result to the buffer section 14.

[0065] The received-data discrimination section 20 determines whether the coded voice signal input from the IP
packet receive section 10 is a voice signal, such as ordinary conversation, or a signal other than a voice
signal, such as a facsimile signal, and outputs the result to the facsimile protocol analysis section 22. If,
based on the discrimination result of the received-data discrimination section 20, the signal is other than a
voice signal, the facsimile protocol analysis section 22 performs simple decoding of the coded voice signal from
the IP packet receive section 10 and judges whether it is a facsimile signal. If it is a facsimile signal, the
facsimile protocol analysis section 22 analyzes the protocol of the facsimile signal and notifies the buffer
control section 15.
[0066] In accordance with instructions from the buffer control section 15, the silent coded voice signal
generation section 13 generates a silent coded voice signal coded by the same coding method as the coded voice
signal input to the buffer section 14, and outputs it to the buffer section 14. The buffer section 14
temporarily stores the coded voice signal input through the marker assignment section 12. After discarding and
insertion of coded voice signals have been performed by the buffer control section 15, the buffer section 14
periodically outputs the coded voice signal, with the marker information removed, to the decoding section 17
based on the clock of the receiver side device. Although insertion and discarding are performed, the order of
the coded voice signals stored in the buffer section 14 is not disturbed, and they are output in the order in
which they were input.

[0067] Here, the input to the buffer section 14 is performed based on the clock of the IP packet transmission
source, and the output from the buffer section 14 is performed based on the clock of the receiver side device.
Therefore, when the clock of the transmission source is faster than the clock of the receiver side device, the
accumulated amount of the buffer section 14 tends to increase; conversely, when the clock of the transmission
source is slower than the clock of the receiver side device, the accumulated amount of the buffer section 14
tends to decrease. The buffer capacity monitor section 16 monitors the input and output status of the buffer
section 14 and the insertion and discard status, and notifies the buffer control section 15 of the accumulated
amount of coded voice signals in the buffer section 14. The decoding section 17 decodes the coded voice signal
output from the buffer section 14 and outputs it as a voice signal.

[0068] In a facsimile signal, the silent intervals are specified precisely: for example, the silent interval
between the DCS signal and the EPT signal is 75±20 ms, and the silent interval between the EPT signal and the
training signal is 20 to 25 ms. Therefore, when insertion/discarding is applied to the silent parts of the coded
voice signal in the buffer section 14 and the signal is a facsimile signal, it is desirable to perform the
processing while avoiding the sections that are critical for the above protocol.
[0069] Accordingly, when the coded voice signal temporarily stored in the buffer section 14 is a facsimile
signal, the buffer control section 15, based on the information from the facsimile protocol analysis section 22,
performs control so that insertion/discarding is applied only to silent sections in which it causes no problem
for the facsimile protocol. The other operations of the buffer control section 15 are the same as those
described with reference to Fig. 2 in Embodiment 1, and their description is omitted.
[0070] As described above, a marker indicating voiced/silent is attached to the received coded voice signal,
which is stored temporarily in a buffer, and coded voice signals in the buffer are inserted or discarded
according to the accumulated amount of the buffer. This absorbs the difference between the clock of the
transmission source and the clock of the receiver side device, so that a clock difference absorbing function
with higher voice quality and high precision can be realized at low cost. Moreover, by performing control that
also takes the facsimile protocol into consideration, high communication quality can be provided for signals
other than voice signals such as ordinary conversation, for example facsimile signals.

[0071] Embodiment 6. Embodiment 6 is described below with reference to the drawings. Fig. 10 is a block diagram
of the voice transmission device of Embodiment 6. In Fig. 10, reference numeral 10 denotes an IP packet receive
section that extracts a coded voice signal from a received voice IP packet; 11 denotes a voice detection section
that detects and judges the voiced/silent state of the voice signal output from the decoding section 17; 12
denotes a marker assignment section that attaches a marker indicating voiced/silent to the voice signal from the
decoding section 17 based on the information from the voice detection section 11; 23 denotes a silent voice
signal generation section that generates and outputs a silent voice signal in accordance with instructions from
the buffer control section 15; 14 denotes a buffer section that temporarily stores the voice signal input
through the marker assignment section 12; 15 denotes a buffer control section that, based on the information
from the buffer capacity monitor section 16, inserts the silent voice signal from the silent voice signal
generation section 23 into the voice signal temporarily stored in the buffer section 14 and discards voice
signals in the buffer section 14; 16 denotes a buffer capacity monitor section that monitors the accumulated
amount of voice signals in the buffer; and 17 denotes a decoding section that decodes the coded voice signal
output from the IP packet receive section 10.

[0072] Next, the operation is described. The IP packet receive section 10 extracts the coded voice signal
stored in the input voice IP packet and outputs it to the decoding section 17. The decoding section 17 decodes
the coded voice signal output from the IP packet receive section 10 and outputs it as a voice signal. The voice
detection section 11 detects and judges whether the voice signal concerned is in a voiced state or a silent
state, from the voice level of the voice signal from the decoding section 17 or the like, and outputs the result
to the marker assignment section 12. Based on the voiced/silent information from the voice detection section 11,
the marker assignment section 12 attaches a marker to the voice signal input from the decoding section 17 and
outputs the result to the buffer section 14; for example, if the voice signal is 8-bit information, the marker
indicating whether the signal is in a voiced state or a silent state is attached as 9th-bit information.
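
One way to picture the 9th-bit marker is to carry each 8-bit sample in a wider word. The sketch below is an
illustration only: the names attach_marker and strip_marker are hypothetical, and the polarity (bit set means
voiced) is an assumption, since the text does not state which value denotes which state.

```python
VOICED_BIT = 1 << 8   # the 9th bit, above the 8 data bits (polarity assumed)


def attach_marker(sample, voiced):
    """Return a 9-bit word: 8-bit sample plus voiced/silent marker bit."""
    if not 0 <= sample <= 0xFF:
        raise ValueError("sample must fit in 8 bits")
    return sample | VOICED_BIT if voiced else sample


def strip_marker(word):
    """Split a 9-bit word back into (8-bit sample, voiced flag).

    This mirrors the buffer output, which carries the signal without the marker.
    """
    return word & 0xFF, bool(word & VOICED_BIT)
```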

[0073] In accordance with instructions from the buffer control section 15, the silent voice signal generation
section 23 generates a silent voice signal of the same format as the voice signal input to the buffer section
14, and outputs it to the buffer section 14. The buffer section 14 temporarily stores the voice signal input
through the marker assignment section 12. After discarding and insertion of voice signals have been performed by
the buffer control section 15, the buffer section 14 periodically outputs the voice signal, with the marker
information removed, based on the clock of the receiver side device. Although insertion and discarding are
performed, the order of the voice signals stored in the buffer section 14 is not disturbed, and they are output
in the order in which they were input.

[0074] Here, the input to the buffer section 14 is performed based on the clock of the IP packet transmission
source, and the output from the buffer section 14 is performed based on the clock of the receiver side device.
Therefore, when the clock of the transmission source is faster than the clock of the receiver side device, the
accumulated amount of the buffer section 14 tends to increase; conversely, when the clock of the transmission
source is slower than the clock of the receiver side device, the accumulated amount of the buffer section 14
tends to decrease. The buffer capacity monitor section 16 monitors the input and output status of the buffer
section 14 and the insertion and discard status, and notifies the buffer control section 15 of the accumulated
amount of voice signals in the buffer section 14.
[0075] The operation of the buffer control section 15 is described. Fig. 11 is a flow chart showing the
operation of the buffer control section 15. The buffer control section 15 checks the accumulated amount of the
buffer section 14 based on the information from the buffer capacity monitor section 16 (step S41), and judges
whether the accumulated amount is at or below a predetermined lower limit (step S42).
[0076] If the amount is at or below the lower limit, the markers attached to the voice signals stored in the
buffer section 14 are examined to find a silent section, the silent voice signal generation section 23 is
instructed to generate a silent voice signal, and one voice sample of that silent voice signal is inserted into
the silent section, which increases the accumulated amount in the buffer section 14 (step S43). If the amount is
not at or below the lower limit, it is judged whether the accumulated amount is at or above a predetermined
upper limit (step S44). If the amount is at or above the upper limit, the markers attached to the voice signals
stored in the buffer section 14 are examined to find a silent section, and one voice sample of the silent voice
signal in that silent section is discarded, which reduces the accumulated amount in the buffer section 14 (step
S45). If the amount is not at or above the upper limit, no processing is performed.
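
The flow of Fig. 11 (steps S41 to S45) can be sketched as below. This is a simplified model under stated
assumptions: the buffer is a list of (is_voiced, sample) pairs, the first silent sample found stands in for
"the silent section", and no action is taken when no silent sample exists (the text does not say what happens
in that case). The names first_silent_index and control_step are hypothetical.

```python
def first_silent_index(buffer):
    """Index of the first sample marked silent, or None if there is none."""
    for i, (voiced, _sample) in enumerate(buffer):
        if not voiced:
            return i
    return None


def control_step(buffer, lower, upper, silent_sample=0):
    """One pass of the buffer control: insert or discard a single silent sample."""
    amount = len(buffer)                                  # step S41
    if amount <= lower:                                   # step S42
        i = first_silent_index(buffer)
        if i is not None:
            buffer.insert(i, (False, silent_sample))      # step S43
            return "insert"
    elif amount >= upper:                                 # step S44
        i = first_silent_index(buffer)
        if i is not None:
            del buffer[i]                                 # step S45
            return "discard"
    return "none"
```

Because only samples already marked silent are touched, the voiced samples keep their relative order and
spacing, which is the property the text relies on for voice quality.
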
[0077] Another operation of the buffer control section 15 is also described. Fig. 12 is a flow chart showing
this operation of the buffer control section 15. The buffer control section 15 checks the accumulated amount of
the buffer section 14 based on the information from the buffer capacity monitor section 16 (step S51), and
judges whether the accumulated amount is below a predetermined first lower limit (step S52). If it is below the
first lower limit, it is further judged whether the accumulated amount is below a predetermined second lower
limit (step S53). Here, the first lower limit is larger than the second lower limit. If the amount is below the
second lower limit, the markers attached to the voice signals stored in the buffer section 14 are examined to
judge whether there is a silent section (step S55).
[0078] If there is a silent section, the silent voice signal generation section 23 is instructed to generate a
silent voice signal, and one voice sample of that silent voice signal is inserted into the silent section of the
voice signal stored in the buffer section 14, which increases the accumulated amount in the buffer section 14
(step S56). If there is no silent section, interpolation processing is applied to the voice signal stored in the
buffer section 14 to generate and insert one voice sample, which increases the accumulated amount in the buffer
section 14 (step S57). Here, interpolation processing means processing that compensates for a missing part of
the signal in a voiced section of the voice signal from the signal states before and after the missing part.
[0079] If the amount is not below the second lower limit, the markers attached to the voice signals stored in
the buffer section 14 are examined to judge whether there is a silent section (step S54). If there is a silent
section, the silent voice signal generation section 23 is instructed to generate a silent voice signal, and, as
above, one voice sample of that silent voice signal is inserted into the silent section of the voice signal
stored in the buffer section 14, which increases the accumulated amount in the buffer section 14 (step S56). If
there is no silent section, no processing is performed. If the amount is not below the first lower limit, it is
judged whether the accumulated amount is at or above a predetermined first upper limit (step S58). If it is at
or above the first upper limit, it is further judged whether the accumulated amount is at or above a
predetermined second upper limit (step S59). Here, the first upper limit is smaller than the second upper limit.

[0080] If the amount is at or above the second upper limit, the markers attached to the voice signals stored in
the buffer section 14 are examined to judge whether there is a silent section (step S61). If there is a silent
section, one voice sample of the silent voice signal stored in the buffer section 14 is discarded, which reduces
the accumulated amount in the buffer section 14 (step S62). If there is no silent section, thinning (decimation)
processing is applied to the voice signal stored in the buffer section 14 to discard one voice sample, which
reduces the accumulated amount in the buffer section 14 (step S63).

[0081] If the amount is not at or above the second upper limit, the markers attached to the voice signals stored
in the buffer section 14 are examined to judge whether there is a silent section (step S60). If there is a
silent section, as above, one voice sample of the silent voice signal stored in the buffer section 14 is
discarded, which reduces the accumulated amount in the buffer section 14 (step S62). If there is no silent
section, no processing is performed. If the amount is not at or above the first upper limit, no processing is
performed.
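
The branching of Fig. 12 (steps S51 to S63) reduces to a small decision function. The sketch below only
selects the action; how interpolation and thinning are actually computed is not specified in the text and is
not modelled. The name decide and the action labels are hypothetical, and the limits are assumed to satisfy
lower2 < lower1 < upper1 < upper2.

```python
def decide(amount, has_silence, lower1, lower2, upper1, upper2):
    """Return the action for one control pass.

    lower1 > lower2 and upper1 < upper2, as stated in the text.
    """
    if amount < lower1:                                              # step S52
        if amount < lower2:                                          # step S53
            # Severe shortage: act even when no silent section exists (S55).
            return "insert_silence" if has_silence else "interpolate"    # S56 / S57
        # Mild shortage: act only on a silent section (S54).
        return "insert_silence" if has_silence else "none"           # S56
    if amount >= upper1:                                             # step S58
        if amount >= upper2:                                         # step S59
            # Severe excess: act even when no silent section exists (S61).
            return "discard_silence" if has_silence else "decimate"      # S62 / S63
        # Mild excess: act only on a silent section (S60).
        return "discard_silence" if has_silence else "none"          # S62
    return "none"
```

The inner limits thus trigger only the harmless adjustment in silence, while the outer limits permit touching
voiced samples as a last resort before the buffer underflows or overflows.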

[0082] As described above, a marker indicating voiced/silent is attached to the received and decoded voice
signal, which is stored temporarily in a buffer, and voice signals in the buffer are inserted or discarded
according to the accumulated amount of the buffer. This absorbs the difference between the clock of the
transmission source and the clock of the receiver side device, so that a clock difference absorbing function
with higher voice quality and high precision can be realized at low cost. Moreover, because the processing is
applied to the decoded voice signal, finer control is possible than when processing is performed in units of
voice frames having a certain time length, and a clock difference absorbing function with still higher voice
quality and high precision can be realized at low cost.

[0083] Embodiment 7. Embodiment 7 is described below with reference to the drawings. Fig. 13 is a block diagram
of the voice transmission device of Embodiment 7. In Fig. 13, reference numerals identical to those in Fig. 10
denote the same or corresponding parts, and their description is omitted. Reference numeral 18 denotes a silence
duration measurement section that measures the duration of the silent section input to the buffer section 14.
[0084] Next, the operation is described. The IP packet receive section 10 extracts the coded voice signal
stored in the input voice IP packet and outputs it to the decoding section 17. The decoding section 17 decodes
the coded voice signal output from the IP packet receive section 10 and outputs it as a voice signal. The voice
detection section 11 detects and judges whether the voice signal concerned is in a voiced state or a silent
state, from the voice level of the voice signal from the decoding section 17 or the like, and outputs the result
to the marker assignment section 12. Based on the voiced/silent information from the voice detection section 11,
the marker assignment section 12 attaches a marker to the voice signal input from the decoding section 17 and
outputs the result to the buffer section 14; for example, if the voice signal is 8-bit information, the marker
indicating whether the signal is in a voiced state or a silent state is attached as 9th-bit information.

[0085] The silence duration measurement section 18 monitors the markers attached to the voice signal input to
the buffer section 14, measures the duration of the silent state of the voice signal input to the buffer section
14, and notifies the buffer control section 15 of the result. In accordance with instructions from the buffer
control section 15, the silent voice signal generation section 23 generates a silent voice signal of the same
format as the voice signal input to the buffer section 14, and outputs it to the buffer section 14. The buffer
section 14 temporarily stores the voice signal input through the marker assignment section 12. After discarding
and insertion of voice signals have been performed by the buffer control section 15, the buffer section 14
periodically outputs the voice signal, with the marker information removed, based on the clock of the receiver
side device. Although insertion and discarding are performed, the order of the voice signals stored in the
buffer section 14 is not disturbed, and they are output in the order in which they were input.

[0086] Here, the input to the buffer section 14 is performed based on the clock of the IP packet transmission
source, and the output from the buffer section 14 is performed based on the clock of the receiver side device.
Therefore, when the clock of the transmission source is faster than the clock of the receiver side device, the
accumulated amount of the buffer section 14 tends to increase; conversely, when the clock of the transmission
source is slower than the clock of the receiver side device, the accumulated amount of the buffer section 14
tends to decrease. The buffer capacity monitor section 16 monitors the input and output status of the buffer
section 14 and the insertion and discard status, and notifies the buffer control section 15 of the accumulated
amount of voice signals in the buffer section 14.
[0087] The operation of the buffer control section 15 is described. Fig. 14 is a flow chart showing the
operation of the buffer control section 15. The buffer control section 15 checks the accumulated amount of the
buffer section 14 based on the information from the buffer capacity monitor section 16 (step S71), and judges
whether the accumulated amount is at or below a predetermined lower limit (step S72).
[0088] If the amount is at or below the lower limit, the duration of the silent state is checked based on the
information from the silence duration measurement section 18 (step S73), and it is judged whether the duration
of the silent state is shorter than a predetermined threshold (step S74). If it is shorter than the threshold,
the markers attached to the voice signals stored in the buffer section 14 are examined to find the silent
section, the silent voice signal generation section 23 is instructed to generate a silent voice signal, and N
voice samples of that silent voice signal are inserted into the silent section, which increases the accumulated
amount in the buffer section 14 (step S75). If it is not shorter than the threshold, the markers attached to the
voice signals stored in the buffer section 14 are examined to find the silent section, the silent voice signal
generation section 23 is instructed to generate a silent voice signal, and M voice samples of that silent voice
signal are inserted into the silent section, which increases the accumulated amount in the buffer section 14
(step S76). Here, N is smaller than M.

[0089] If the accumulated amount of the buffer is not at or below the lower limit, it is judged whether the
accumulated amount is at or above a predetermined upper limit (step S77). If it is at or above the upper limit,
the duration of the silent state is checked based on the information from the silence duration measurement
section 18 (step S78), and it is judged whether the duration of the silent state is shorter than a predetermined
threshold (step S79). If it is shorter than the threshold, the markers attached to the voice signals stored in
the buffer section 14 are examined to find the silent section, and X voice samples of the silent voice signal in
that silent section are discarded, which reduces the accumulated amount in the buffer section 14 (step S80). If
it is not shorter than the threshold, the markers attached to the voice signals stored in the buffer section 14
are examined to find the silent section, and Y voice samples of the silent voice signal in that silent section
are discarded, which reduces the accumulated amount in the buffer section 14 (step S81). Here, X is smaller than
Y. If the accumulated amount of the buffer is not at or above the upper limit, no processing is performed.
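
The flow of Fig. 14 (steps S71 to S81) scales the size of each adjustment with the length of the silence. A
minimal sketch of that decision follows; the function name adjustment, the tuple return format and the unit of
silence_duration are hypothetical, and the text only requires N < M and X < Y.

```python
def adjustment(amount, silence_duration, lower, upper, threshold, n, m, x, y):
    """Return (action, sample_count) for one control pass.

    n < m are the insert counts and x < y the discard counts for short and
    long silences respectively; silence_duration and threshold share a unit.
    """
    short = silence_duration < threshold                  # steps S74 / S79
    if amount <= lower:                                   # step S72
        return ("insert", n if short else m)              # steps S75 / S76
    if amount >= upper:                                   # step S77
        return ("discard", x if short else y)             # steps S80 / S81
    return ("none", 0)
```

A long silence can absorb a larger change without being noticed, so more samples are moved at once there, while
a short pause between words is altered only slightly.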

[0090] As described above, a marker indicating voiced/silent is attached to the received and decoded voice
signal, which is stored temporarily in a buffer, and voice signals in the buffer are inserted or discarded
according to the accumulated amount of the buffer, while the amount inserted or discarded is adjusted according
to the length of the silent section in which the insertion/discarding is performed. This absorbs the difference
between the clock of the transmission source and the clock of the receiver side device, so that a clock
difference absorbing function with higher voice quality and high precision can be realized at low cost.
Moreover, because the processing is applied to the decoded voice signal, finer control is possible than when
processing is performed in units of voice frames having a certain time length, and a clock difference absorbing
function with still higher voice quality and high precision can be realized at low cost.

[0091] Embodiment 8. Embodiment 8 is described below with reference to the drawings. Fig. 15 is a block diagram
of the voice transmission device of Embodiment 8. In Fig. 15, reference numerals identical to those in Fig. 10
denote the same or corresponding parts, and their description is omitted. Reference numeral 19 denotes a second
marker assignment section that, based on the information from the voice detection section 11, attaches a marker
indicating pre-hangover and hangover to the coded voice signal input through the marker assignment section 12.

[0092] Next, the operation is explained. The IP packet receive section 10 extracts the coded sound signal
stored in the inputted voice IP packet and outputs it to the decode section 17. The decode section 17 decodes
the coded sound signal outputted from the IP packet receive section 10 and outputs it as a sound signal. The
voice detection section 11 detects and judges, from the voice level of the sound signal from the decode
section 17 and the like, whether the sound signal is in a voiced condition or a silent condition, and outputs
the result to the marker grant section 12. Based on the voiced / silent information from the voice detection
section 11, the marker grant section 12 attaches to the sound signal inputted from the decode section 17 a
marker indicating whether it is in a voiced condition or a silent condition, for example as 9th-bit
information if the sound signal is 8-bit information, and outputs the result to the 2nd marker grant section 19.

[0093] Based on the voiced / silent information from the voice detection section 11, the 2nd marker grant
section 19 treats the part extending a certain fixed time before the point of change from a silent condition
to a voiced condition as a front hangover, and the part extending a certain fixed time after the point of
change from a voiced condition to a silent condition as a hangover. To the sound signal inputted through the
marker grant section 12 it attaches, in the same manner as the marker grant section 12, a 2nd marker
indicating whether the part is a front hangover part or a hangover part, for example as 10th-bit information
if the sound signal is 8-bit information and the 9th bit is the marker information of the marker grant
section 12, and outputs the result to the buffer section 14.
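
The classification into voiced, silent, front hangover, and hangover parts can be illustrated with a short sketch. It is a hedged example only: the function name, the string labels, the use of one common hangover length for both kinds, and the rule that an already marked hangover is not overwritten by a front hangover are all assumptions made for illustration. The text itself specifies only that a fixed time before a silent-to-voiced change and a fixed time after a voiced-to-silent change are marked.

```python
def classify_sections(voiced, hang):
    """Label each sample as 'voiced', 'front', 'hang' or 'silent'.

    voiced: list of booleans from a voice detector (True = voiced).
    hang:   hypothetical fixed length, in samples, of both hangover kinds.
    """
    n = len(voiced)
    labels = ["voiced" if v else "silent" for v in voiced]
    for i in range(1, n):
        if voiced[i] and not voiced[i - 1]:
            # Silent -> voiced change: mark the silent samples just before it.
            for j in range(max(0, i - hang), i):
                if labels[j] == "silent":
                    labels[j] = "front"
        elif not voiced[i] and voiced[i - 1]:
            # Voiced -> silent change: mark the silent samples just after it.
            for j in range(i, min(n, i + hang)):
                if voiced[j]:
                    break
                labels[j] = "hang"
    return labels
```

Marking a front hangover requires looking ahead to a change that has not yet been played out, which is possible here because the samples wait in the buffer before they are outputted.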

[0094] The silent sound signal generation section 23 generates, according to directions from the buffer
control section 15, a silent sound signal of the same format as the sound signal inputted into the buffer
section 14, and outputs it to the buffer section 14. The buffer section 14 temporarily accumulates the sound
signal inputted through the marker grant section 12 and the 2nd marker grant section 19 and, after the buffer
control section 15 has discarded or inserted sound signals, periodically outputs the sound signal with the
marker information removed, based on the clock of the receiving-side equipment. Although insertion and
discarding are carried out, the order of the sound signals accumulated in the buffer section 14 is not
disturbed, and they are outputted in the order in which they were inputted.

[0095] Here, the input to the buffer section 14 is performed based on the clock of the IP packet transmission
source, and the output from the buffer section 14 is performed based on the clock of the receiving-side
equipment. Therefore, when the clock of the transmission source is faster than the clock of the receiving-side
equipment, the accumulated amount in the buffer section 14 tends to increase, and conversely, when the clock
of the transmission source is slower than the clock of the receiving-side equipment, the accumulated amount in
the buffer section 14 tends to decrease. The buffer capacity monitor section 16 monitors the input and output
of the buffer section 14 and the insertion and discarding performed on it, and notifies the buffer control
section 15 of the accumulated amount of sound signals in the buffer section 14.
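
The tendency described above can be put into numbers with a small sketch. The sampling rate and the clock errors in the usage note are hypothetical figures chosen for illustration; the text does not state any.

```python
def buffer_drift_samples(nominal_rate_hz, sender_ppm, receiver_ppm, seconds):
    """Net change in the number of buffered samples after `seconds`.

    sender_ppm / receiver_ppm are clock errors in parts per million
    relative to the nominal rate (hypothetical values, not from the text).
    A positive result means the buffer fills; a negative one means it drains.
    """
    in_rate = nominal_rate_hz * (1 + sender_ppm / 1e6)     # samples written per second
    out_rate = nominal_rate_hz * (1 + receiver_ppm / 1e6)  # samples read per second
    return (in_rate - out_rate) * seconds
```

For example, at a nominal 8000 samples per second with a sender 50 ppm fast and a receiver 50 ppm slow, the buffer would gain about 48 samples per minute, which is the kind of slow drift that occasional insertion or discarding in silent sections is meant to cancel.
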
[0096] The operation of the buffer control section 15 is explained. Drawing 16 is a flow chart showing the
operation of the buffer control section 15. The buffer control section 15 checks the accumulated amount in the
buffer section 14 based on the information from the buffer capacity monitor section 16 (step S91), and judges
whether the accumulated amount is at or below a predetermined lower limit (step S92).
[0097] If it is at or below the lower limit, the markers attached to the sound signals accumulated in the
buffer section 14 are examined to find a silent section that is neither a front hangover section nor a
hangover section, the silent sound signal generation section 23 is directed to generate a silent sound signal,
and that signal is inserted into the silent section by a fixed number of voice samples, which increases the
accumulated amount in the buffer section 14 (step S93). If it is not at or below the lower limit, it is judged
whether the accumulated amount is at or above a predetermined 1st upper limit (step S94). If it is at or above
the 1st upper limit, it is judged whether the accumulated amount is at or above a predetermined 2nd upper
limit (step S95). If it is at or above the 2nd upper limit, it is judged whether the accumulated amount is at
or above a predetermined 3rd upper limit (step S97).
[0098] If it is at or above the 3rd upper limit, the markers attached to the sound signals accumulated in the
buffer section 14 are examined to find a front hangover section, and the sound signal in the front hangover
section is discarded by a fixed number of voice samples, which reduces the accumulated amount in the buffer
section 14 (step S99). If it is not at or above the 3rd upper limit, the markers attached to the sound signals
accumulated in the buffer section 14 are examined to find a hangover section, and the sound signal in the
hangover section is discarded by a fixed number of voice samples, which reduces the accumulated amount in the
buffer section 14 (step S98).

[0099] If it is not at or above the 2nd upper limit, the markers attached to the sound signals accumulated in
the buffer section 14 are examined to find a silent section that is neither a front hangover section nor a
hangover section, and the sound signal in that silent section is discarded by a fixed number of voice samples,
which reduces the accumulated amount in the buffer section 14 (step S96). No processing is performed if it is
not at or above the 1st upper limit.
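
The decision sequence of the flow chart described above can be summarized in a short sketch. The function name, the returned labels, and the assumed ordering of the limits are illustrative choices and not part of the original description.

```python
def choose_action(level, lower, upper1, upper2, upper3):
    """Return (operation, target section) for a given buffer fill level.

    The limits are assumed to satisfy lower < upper1 <= upper2 <= upper3.
    """
    if level <= lower:
        return ("insert", "silent")           # at or below the lower limit
    if level < upper1:
        return ("none", None)                 # within the normal range
    if level < upper2:
        return ("discard", "silent")          # at or above the 1st upper limit only
    if level < upper3:
        return ("discard", "hangover")        # at or above the 2nd upper limit
    return ("discard", "front_hangover")      # at or above the 3rd upper limit
```

The ordering reflects a graded response: plain silence is touched first, the hangover after speech only when the buffer is fuller, and the front hangover before speech only when the buffer is fullest.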

[0100] As described above, a marker indicating voiced / silent and a marker indicating the front hangover
section and the hangover section are attached to the received and decoded sound signal, and the signal is
accumulated temporarily in a buffer. By inserting or discarding sound signals in the buffer according to the
accumulated amount of the buffer and according to the classification of the sound signal in the buffer as
voiced, silent, front hangover, or hangover, the difference between the clock of the transmission source and
the clock of the receiving-side equipment can be absorbed, and a highly precise clock difference absorption
function with higher voice quality can be realized at low cost. Moreover, because the processing is applied to
the decoded sound signal, finer control is possible than when processing is performed per voice frame of a
certain time length, and a highly precise clock difference absorption function with still higher voice quality
can be realized at low cost.

[0101] Embodiment 9. Embodiment 9 is explained below with reference to the drawings. Drawing 17 is a block
diagram of the voice transmission equipment of Embodiment 9. In Drawing 17, the same signs as in Drawing 10
denote the same or corresponding parts, so their explanation is omitted. 20 is a received-data distinction
section, which distinguishes whether the sound signal from the decode section 17 is a sound signal such as
ordinary conversation or a signal other than a sound signal, such as a facsimile signal. 21 is a selector,
which selects either the input from the decode section 17 or the input from the buffer section 14 for output,
based on the distinction result of the received-data distinction section 20.



[0102] Next, the operation is explained. The IP packet receive section 10 extracts the coded sound signal
stored in the inputted voice IP packet and outputs it to the decode section 17. The decode section 17 decodes
the coded sound signal outputted from the IP packet receive section 10 and outputs it as a sound signal. The
voice detection section 11 detects and judges, from the voice level of the sound signal from the decode
section 17 and the like, whether the sound signal is in a voiced condition or a silent condition, and outputs
the result to the marker grant section 12. Based on the voiced / silent information from the voice detection
section 11, the marker grant section 12 attaches to the sound signal inputted from the decode section 17 a
marker indicating whether it is in a voiced condition or a silent condition, for example as 9th-bit
information if the sound signal is 8-bit information, and outputs the result to the buffer section 14.

[0103] The received-data distinction section 20 distinguishes whether the sound signal inputted from the
decode section 17 is a sound signal such as ordinary conversation or a signal other than a sound signal, such
as a facsimile signal, and outputs the result to the selector 21. According to the directions from the
received-data distinction section 20, the selector 21 selects and outputs the input from the buffer section 14
if the signal has been distinguished as a sound signal, and selects and outputs the input from the decode
section 17 if it has been distinguished as a signal other than a sound signal, such as a facsimile signal. The
silent sound signal generation section 23 generates, according to directions from the buffer control section
15, a silent sound signal of the same format as the sound signal inputted into the buffer section 14, and
outputs it to the buffer section 14.

[0104] The buffer section 14 temporarily accumulates the sound signal inputted through the marker grant
section 12 and, after the buffer control section 15 has discarded or inserted sound signals, periodically
outputs the sound signal with the marker information removed to the selector 21, based on the clock of the
receiving-side equipment. Although insertion and discarding are carried out, the order of the sound signals
accumulated in the buffer section 14 is not disturbed, and they are outputted in the order in which they were
inputted.
[0105] Here, the input to the buffer section 14 is performed based on the clock of the IP packet transmission
source, and the output from the buffer section 14 is performed based on the clock of the receiving-side
equipment. Therefore, when the clock of the transmission source is faster than the clock of the receiving-side
equipment, the accumulated amount in the buffer section 14 tends to increase, and conversely, when the clock
of the transmission source is slower than the clock of the receiving-side equipment, the accumulated amount in
the buffer section 14 tends to decrease. The buffer capacity monitor section 16 monitors the input and output
of the buffer section 14 and the insertion and discarding performed on it, and notifies the buffer control
section 15 of the accumulated amount of sound signals in the buffer section 14. The operation of the buffer
control section 15 is equivalent to that explained using Drawing 11 for Embodiment 6, so its explanation is
omitted.

[0106] As described above, by attaching a marker indicating voiced / silent to the received and decoded sound
signal, accumulating the signal temporarily in a buffer, and inserting or discarding sound signals in the
buffer according to its accumulated amount, the difference between the clock of the transmission source and
the clock of the receiving-side equipment can be absorbed, and a highly precise clock difference absorption
function with higher voice quality can be realized at low cost. Moreover, by distinguishing signals other than
sound signals, for which the same processing as for a sound signal may be unsuitable, overall high call
quality can also be provided for facsimile signals and other signals that are not sound signals such as
ordinary conversation. Moreover, because the processing is applied to the decoded sound signal, finer control
is possible than when processing is performed per voice frame of a certain time length, and a highly precise
clock difference absorption function with still higher voice quality can be realized at low cost.

[0107] Embodiment 10. Embodiment 10 is explained below with reference to the drawings. Drawing 18 is a block
diagram of the voice transmission equipment of Embodiment 10. In Drawing 18, the same signs as in Drawing 17
denote the same or corresponding parts, so their explanation is omitted. 22 is a facsimile protocol analysis
section, which judges whether the sound signal from the decode section 17 is a facsimile signal and, if it is
a facsimile signal, analyzes its protocol.

[0108] Next, the operation is explained. The IP packet receive section 10 extracts the coded sound signal
stored in the inputted voice IP packet and outputs it to the decode section 17. The decode section 17 decodes
the coded sound signal outputted from the IP packet receive section 10 and outputs it as a sound signal. The
voice detection section 11 detects and judges, from the voice level of the sound signal from the decode
section 17 and the like, whether the sound signal is in a voiced condition or a silent condition, and outputs
the result to the marker grant section 12. Based on the voiced / silent information from the voice detection
section 11, the marker grant section 12 attaches to the sound signal inputted from the decode section 17 a
marker indicating whether it is in a voiced condition or a silent condition, for example as 9th-bit
information if the sound signal is 8-bit information, and outputs the result to the buffer section 14.

[0109] The received-data distinction section 20 distinguishes whether the sound signal inputted from the
decode section 17 is a sound signal such as ordinary conversation or a signal other than a sound signal, such
as a facsimile signal, and outputs the result to the facsimile protocol analysis section 22. Based on the
distinction result of the received-data distinction section 20, if the signal is other than a sound signal,
the facsimile protocol analysis section 22 judges whether the sound signal from the decode section 17 is a
facsimile signal, and if it is a facsimile signal, analyzes the protocol of the facsimile signal and notifies
the buffer control section 15.

[0110] The silent sound signal generation section 23 generates, according to directions from the buffer
control section 15, a silent sound signal of the same format as the sound signal inputted into the buffer
section 14, and outputs it to the buffer section 14. The buffer section 14 temporarily accumulates the sound
signal inputted through the marker grant section 12 and, after the buffer control section 15 has discarded or
inserted sound signals, periodically outputs the sound signal with the marker information removed, based on
the clock of the receiving-side equipment. Although insertion and discarding are carried out, the order of the
coded sound signals accumulated in the buffer section 14 is not disturbed, and they are outputted in the order
in which they were inputted.

[0111] Here, the input to the buffer section 14 is performed based on the clock of the IP packet transmission
source, and the output from the buffer section 14 is performed based on the clock of the receiving-side
equipment. Therefore, when the clock of the transmission source is faster than the clock of the receiving-side
equipment, the accumulated amount in the buffer section 14 tends to increase, and conversely, when the clock
of the transmission source is slower than the clock of the receiving-side equipment, the accumulated amount in
the buffer section 14 tends to decrease. The buffer capacity monitor section 16 monitors the input and output
of the buffer section 14 and the insertion and discarding performed on it, and notifies the buffer control
section 15 of the accumulated amount of sound signals in the buffer section 14.
[0112] In a facsimile signal, the silent sections are finely specified; for example, the silent section
between a DCS signal and an EPT signal is specified as 75±20 ms, and that between the EPT signal and the
training signal as 20 to 25 ms. Therefore, when insertion or discarding is controlled for the silent parts of
the sound signal in the buffer section 14 and the signal is a facsimile signal, it is desirable to avoid the
sections that are critical in terms of the protocol. Accordingly, when the sound signal temporarily
accumulated in the buffer section 14 is a facsimile signal, the buffer control section 15, based on the
information from the facsimile protocol analysis section 22, performs control so that processing is applied
only to silent sections in which insertion or discarding causes no problem for the facsimile protocol. The
other operation of the buffer control section 15 is equivalent to that explained using Drawing 11 for
Embodiment 6, so its explanation is omitted.
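
One way to picture the control described above is as a filter that removes protocol-critical silent sections from the candidates for insertion or discarding. The sketch below is hypothetical: the function name, the representation of sections as sample ranges, and the idea that the protocol analysis reports critical intervals as such ranges are assumptions made for illustration.

```python
def adjustable_sections(silent_sections, critical_intervals):
    """Return the silent sections in which insertion or discarding is allowed.

    Both arguments are lists of (start, end) sample ranges, end exclusive.
    A silent section is kept only if it overlaps no critical interval.
    """
    def overlaps(a, b):
        # Two half-open ranges overlap when each starts before the other ends.
        return a[0] < b[1] and b[0] < a[1]

    return [s for s in silent_sections
            if not any(overlaps(s, c) for c in critical_intervals)]
```

With no critical intervals reported, every silent section remains a candidate, which corresponds to the ordinary voice case.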

[0113] As described above, by attaching a marker indicating voiced / silent to the received and decoded sound
signal, accumulating the signal temporarily in a buffer, and inserting or discarding sound signals in the
buffer according to its accumulated amount, the difference between the clock of the transmission source and
the clock of the receiving-side equipment can be absorbed, and a highly precise clock difference absorption
function with higher voice quality can be realized at low cost. Moreover, by performing control that also
takes the facsimile protocol into consideration, overall high call quality can be provided for facsimile
signals, which are not sound signals such as ordinary conversation. Moreover, because the processing is
applied to the decoded sound signal, finer control is possible than when processing is performed per voice
frame of a certain time length, and a highly precise clock difference absorption function with still higher
voice quality can be realized at low cost.
[0114] 

[Effect of the Invention] Since this invention is constituted as explained above, it provides the effects
described below.

[0115] In the 1st to 3rd inventions, the difference between the clock of the transmission source and the clock
of the transmission destination can be absorbed by inserting a sound signal into, or discarding a sound signal
from, the accumulated sound signal based on the accumulated amount of the sound signal accumulated in the
buffer section, so a highly precise clock difference absorption function with higher voice quality can be
realized at low cost.
[0116] In the 4th invention, the difference between the clock of the transmission source and the clock of the
receiving-side equipment can be absorbed by inserting or discarding sound signals according to the duration of
the silent section of the sound signal, that is, the length of the silent section, so a highly precise clock
difference absorption function with higher voice quality can be realized at low cost.
[0117] In the 5th invention, insertion or discarding is performed according to whether the sound signal
accumulated in the buffer section belongs to a voiced section, a silent section, a front hangover section, or
a hangover section, so sound signals in the buffer section can be discarded in order, starting from those
whose discarding causes the fewest problems, and a highly precise clock difference absorption function with
higher voice quality can be realized at low cost.

[0118] In the 6th invention, by distinguishing whether the sound signal is a sound signal of conversation,
sound signals and signals other than sound signals can be told apart, and high call quality can be offered.



[0119] In the 7th invention, by judging whether the sound signal is a facsimile signal and analyzing the
protocol of the facsimile signal, control that takes the facsimile protocol into consideration can be
performed for facsimile signals, which are not sound signals, so high call quality can be offered.

[0120] In the 8th to 12th inventions, the difference between the clock of the transmission source and the
clock of the transmission destination can be absorbed by inserting a sound signal into, or discarding a sound
signal from, the accumulated decoded sound signal based on the accumulated amount of the decoded sound signal
accumulated in the buffer section, so a highly precise clock difference absorption function with higher voice
quality can be realized at low cost.

[0121] In the 13th invention, the difference between the clock of the transmission source and the clock of the
receiving-side equipment can be absorbed by inserting or discarding sound signals according to the duration of
the silent section of the decoded sound signal, that is, the length of the silent section, so a highly precise
clock difference absorption function with higher voice quality can be realized at low cost.

[0122] In the 14th invention, insertion or discarding is performed on the decoded sound signal in the buffer
section according to whether the decoded sound signal accumulated in the buffer section belongs to a voiced
section, a silent section, a front hangover section, or a hangover section, so decoded sound signals can be
discarded in order, starting from those whose discarding causes the fewest problems, and a highly precise
clock difference absorption function with higher voice quality can be realized at low cost.
[0123] In the 15th invention, by distinguishing whether the decoded sound signal is a sound signal of
conversation, decoded sound signals and signals other than sound signals can be told apart, and high call
quality can be offered.

[0124] In the 16th invention, by judging whether the decoded sound signal is a facsimile signal and analyzing
the protocol of the facsimile signal, control that takes the facsimile protocol into consideration can be
performed for facsimile signals, which are not sound signals, so high call quality can be offered.






DESCRIPTION OF DRAWINGS 



[Brief Description of the Drawings] 

[Drawing 1] A block diagram of the voice transmission equipment of Embodiment 1.

[Drawing 2] A flow chart showing the operation of the buffer control section 15 in Embodiment 1.

[Drawing 3] A block diagram of the voice transmission equipment of Embodiment 2.

[Drawing 4] A flow chart showing the operation of the buffer control section 15 in Embodiment 2.

[Drawing 5] A block diagram of the voice transmission equipment of Embodiment 3.

[Drawing 6] A drawing schematically showing the magnitude of a sound signal, the threshold used for the
voiced / silent judgment, and the voiced / silent judgment result in Embodiment 3.

[Drawing 7] A flow chart showing the operation of the buffer control section 15 in Embodiment 3.

[Drawing 8] A block diagram of the voice transmission equipment of Embodiment 4.

[Drawing 9] A block diagram of the voice transmission equipment of Embodiment 5.

[Drawing 10] A block diagram of the voice transmission equipment of Embodiment 6.

[Drawing 11] A flow chart showing the operation of the buffer control section 15 in Embodiment 6.

[Drawing 12] A flow chart showing another operation of the buffer control section 15 in Embodiment 6.

[Drawing 13] A block diagram of the voice transmission equipment of Embodiment 7.

[Drawing 14] A flow chart showing the operation of the buffer control section 15 in Embodiment 7.

[Drawing 15] A block diagram of the voice transmission equipment of Embodiment 8.

[Drawing 16] A flow chart showing the operation of the buffer control section 15 in Embodiment 8.

[Drawing 17] A block diagram of the voice transmission equipment of Embodiment 9.

[Drawing 18] A block diagram of the voice transmission equipment of Embodiment 10.

[Drawing 19] A block diagram of the conventional voice transmission equipment.

[Description of Notations]

10 IP packet receive section, 11 voice detection section, 12 marker grant section, 13 silent coded sound
signal generation section, 14 buffer section, 15 buffer control section, 16 buffer capacity monitor section,
17 decode section.
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